[jira] [Commented] (HDFS-9260) Improve performance and GC friendliness of startup and FBRs

Staffan Friberg (JIRA) Thu, 28 Jan 2016 14:41:15 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15122460#comment-15122460
 ]


Staffan Friberg commented on HDFS-9260:
---------------------------------------

Hi [~jingzhao],

Thank you for your comments! I have uploaded an updated patch (version 16).

1. Done, moved to context
2. Done
3. Done, removed
4. I have looked at this several times while working on the patch, but have 
so far failed to find a simple way to separate it. The remove methods are so 
deeply linked when removing a block that I can't really figure out a clean 
way to lift it out, and even if it were possible it would in itself be a 
fairly large change, I believe. Let me know if you have any ideas.
5. Done, looking up directly in the map with a new Block(replicaID).
6. Done, removed
7. The reason for duplicating it is basically to avoid having the NN allocate 
4 LinkedLists for each block that is reported in an IBR. Potentially one 
could change the fullBR to not rely on lists and simply add/remove as it 
finds entries. Two issues need to be thought through for this: how should 
logging be handled, since it relies on some counting of the number of handled 
blocks, and is it better to have multiple loops with a smaller code footprint 
than to expand the already large one with even more code to handle each case 
directly? I agree with you that having two code paths is bad, but I think the 
reduction in allocation for IBRs could be worth it.
8. Done, I do the same checks I do in removeLeft/Right
9, 10. Good point. Is it required to hold the read lock around the loops, or 
would it be enough to hold it just around the innermost iteration that 
calculates the fragmentation for a storage? That would help reduce the time 
the lock is held significantly for the first iteration. I need to think a bit 
more about the second part, an abort mechanism for when actually doing 
defragmentation. What is an OK time limit? I saw 4ms being mentioned in 
HDFS-9198.
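
To make the question in 9, 10 concrete, here is a rough sketch of the two 
ideas: taking the read lock per storage instead of around the whole loop, and 
aborting a pass once a time budget is used up. This is not code from the 
patch. FragmentationScan, Storage, scan and defragment are hypothetical names, 
the slot-based fragmentation measure is a stand-in, and using the write lock 
for the second pass is my assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch; none of these names come from the HDFS-9260 patch. */
public class FragmentationScan {
    /** Stand-in for a per-storage structure that may contain unused slots. */
    public static final class Storage {
        final int usedSlots;
        final int totalSlots;

        public Storage(int usedSlots, int totalSlots) {
            this.usedSlots = usedSlots;
            this.totalSlots = totalSlots;
        }
    }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Fraction of slots that are unused (assumed measure of fragmentation). */
    static double fragmentation(Storage s) {
        return s.totalSlots == 0 ? 0.0 : 1.0 - (double) s.usedSlots / s.totalSlots;
    }

    /**
     * First pass: the read lock is taken and released once per storage, so
     * writers can get in between two iterations of the loop.
     */
    public List<Double> scan(List<Storage> storages) {
        List<Double> result = new ArrayList<>();
        for (Storage s : storages) {
            lock.readLock().lock();
            try {
                result.add(fragmentation(s));
            } finally {
                lock.readLock().unlock();
            }
        }
        return result;
    }

    /**
     * Second pass: stop as soon as the time budget is used up and report how
     * many storages were handled. The actual compaction is left out.
     */
    public int defragment(List<Storage> candidates, long budgetNanos) {
        int handled = 0;
        lock.writeLock().lock(); // assumption: defragmentation mutates state
        try {
            long deadline = System.nanoTime() + budgetNanos;
            for (Storage s : candidates) {
                if (System.nanoTime() - deadline >= 0) {
                    break; // budget exhausted, abort and retry later
                }
                // placeholder: compact storage s here
                handled++;
            }
        } finally {
            lock.writeLock().unlock();
        }
        return handled;
    }
}
```

With a 4ms limit as mentioned in HDFS-9198, the second pass would be called 
as defragment(candidates, TimeUnit.MILLISECONDS.toNanos(4)) and anything left 
over would be picked up on a later call. The per-storage locking in scan 
trades a consistent snapshot across storages for shorter lock hold times.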

> Improve performance and GC friendliness of startup and FBRs
> -----------------------------------------------------------
>
>                 Key: HDFS-9260
>                 URL: https://issues.apache.org/jira/browse/HDFS-9260
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, namenode, performance
>    Affects Versions: 2.7.1
>            Reporter: Staffan Friberg
>            Assignee: Staffan Friberg
>         Attachments: FBR processing.png, HDFS Block and Replica Management 
> 20151013.pdf, HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch, 
> HDFS-9260.010.patch, HDFS-9260.011.patch, HDFS-9260.012.patch, 
> HDFS-9260.013.patch, HDFS-9260.014.patch, HDFS-9260.015.patch, 
> HDFS-9260.016.patch, HDFSBenchmarks.zip, HDFSBenchmarks2.zip
>
>
> This patch changes the datastructures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear people's feedback on this change.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
