[ 
https://issues.apache.org/jira/browse/HDFS-9260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14986062#comment-14986062
 ] 

Staffan Friberg commented on HDFS-9260:
---------------------------------------

Hi Daryn,

Thanks for taking a look at the patch.

1. FBR and startup both improve; please see the attached PDF.
2. I will need to check what we do here (and whether I still have the old 
logs), but it doesn't feel like it should be affected.
3. We will be slightly slower when deleting a file or removing blocks. The 
current algorithm goes through the LightWeightGSet to first look up/remove each 
affected BlockInfo, and after that removes it from the linked list. In my case 
it will be removed from the TreeSet, which requires a new lookup. However, while 
this is slower, I think the time that processing takes is far outweighed by the 
time it takes to delete or redistribute the blocks on all DNs. Deleting files 
with a large number of blocks seems to take on the order of hours, since we only 
send small parts of the total block list to each node on every heartbeat. I am 
not too familiar with how aggressive the redistribution is in the event of a DN 
decommission.
4. It will decrease as long as the TreeSet is kept above a ~50% fill ratio, 
since the reference to each BlockInfo is now a single pointer from the TreeSet 
instead of the two pointers of the doubly linked list.
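
To illustrate point 3, here is a minimal sketch of the two removal paths. It is 
not code from the patch: Block, LinkedStorage and deleteOne are hypothetical 
names, a HashMap stands in for the LightWeightGSet, and java.util.TreeSet stands 
in for the sorted set, whose actual implementation may differ.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeSet;

public class RemovalSketch {
    /** Hypothetical stand-in for a block entry; not the real BlockInfo. */
    static final class Block implements Comparable<Block> {
        final long id;
        Block prev, next; // links used only by the linked-list variant
        Block(long id) { this.id = id; }
        @Override public int compareTo(Block o) { return Long.compare(id, o.id); }
    }

    /** Linked-list variant: blocks of one storage chained through prev/next. */
    static final class LinkedStorage {
        Block head;
        int size;

        void add(Block b) {
            b.prev = null;
            b.next = head;
            if (head != null) head.prev = b;
            head = b;
            size++;
        }

        /** O(1): the block carries its own links, so no search is needed. */
        void remove(Block b) {
            if (b.prev != null) b.prev.next = b.next; else head = b.next;
            if (b.next != null) b.next.prev = b.prev;
            b.prev = null;
            b.next = null;
            size--;
        }
    }

    /** Builds n blocks, deletes one, returns {list size, tree size, map size}. */
    static int[] deleteOne(int n, long victim) {
        Map<Long, Block> blocksMap = new HashMap<>(); // stand-in for LightWeightGSet
        LinkedStorage linked = new LinkedStorage();
        TreeSet<Block> sorted = new TreeSet<>();
        for (long id = 0; id < n; id++) {
            Block b = new Block(id);
            blocksMap.put(id, b);
            linked.add(b);
            sorted.add(b);
        }
        Block b = blocksMap.remove(victim); // first lookup, needed in both variants
        linked.remove(b);                   // linked list: constant-time unlink
        sorted.remove(b);                   // sorted set: a second, O(log n) lookup
        return new int[] { linked.size, sorted.size(), blocksMap.size() };
    }

    public static void main(String[] args) {
        int[] r = deleteOne(1000, 42L);
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints "999 999 999"
    }
}
```

The extra cost per removed block is the second lookup in the sorted set, which 
is the slowdown described in point 3.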

> Improve performance and GC friendliness of startup and FBRs
> -----------------------------------------------------------
>
>                 Key: HDFS-9260
>                 URL: https://issues.apache.org/jira/browse/HDFS-9260
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, namenode, performance
>    Affects Versions: 2.7.1
>            Reporter: Staffan Friberg
>            Assignee: Staffan Friberg
>         Attachments: HDFS Block and Replica Management 20151013.pdf, 
> HDFS-7435.001.patch, HDFS-7435.002.patch, HDFS-7435.003.patch, 
> HDFS-7435.004.patch, HDFS-7435.005.patch, HDFS-7435.006.patch, 
> HDFS-7435.007.patch, HDFS-9260.008.patch, HDFS-9260.009.patch
>
>
> This patch changes the data structures used for BlockInfos and Replicas to 
> keep them sorted. This allows faster and more GC friendly handling of full 
> block reports.
> Would like to hear people's feedback on this change and also some help 
> investigating/understanding a few outstanding issues if we are interested in 
> moving forward with this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
