[ 
https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15067073#comment-15067073
 ] 

Kihwal Lee commented on HDFS-7784:
----------------------------------

bq.  protobuf seems to generate a lot of garbage during startup, causing many 
full GCs which really consume a lot of time.
One of the large NNs used to do multiple full GCs during start-up, but that was 
mainly due to initial full block report processing. Ever since the young gen 
size was increased, the full GCs have stopped.  We initially feared the minor 
collection time would increase dramatically, but that wasn't the case.  Along 
with the increased YG size, we set {{-XX:ParGCCardsPerStrideChunk=32768}}.

We will look into the javanano version. Thanks for the pointer.

> load fsimage in parallel
> ------------------------
>
>                 Key: HDFS-7784
>                 URL: https://issues.apache.org/jira/browse/HDFS-7784
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Walter Su
>            Assignee: Walter Su
>            Priority: Minor
>              Labels: BB2015-05-TBR
>         Attachments: HDFS-7784.001.patch, test-20150213.pdf
>
>
> When a single Namenode has a huge number of files, without using federation, 
> the startup/restart speed is slow. The fsimage loading step takes most of the 
> time. fsimage loading can be separated into two parts: deserialization and 
> object construction (mostly map insertion). Deserialization takes most of the 
> CPU time, so we can do deserialization in parallel and add to the hashmap 
> serially. This should significantly reduce the NN start time.
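
The two-phase idea in the description (deserialize in parallel, insert into the map serially) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in HDFS-7784.001.patch: {{InodeEntry}}, {{deserialize}}, and the "id:name" record format are hypothetical stand-ins for the real protobuf parsing and inode classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelLoadSketch {
    // Hypothetical stand-in for a loaded inode; not an HDFS class.
    static final class InodeEntry {
        final long id;
        final String name;
        InodeEntry(long id, String name) { this.id = id; this.name = name; }
    }

    // Hypothetical decoder for "id:name" strings; stands in for the
    // CPU-heavy protobuf deserialization step.
    static InodeEntry deserialize(String raw) {
        int sep = raw.indexOf(':');
        return new InodeEntry(Long.parseLong(raw.substring(0, sep)),
                              raw.substring(sep + 1));
    }

    static Map<Long, InodeEntry> load(List<String> rawRecords, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Phase 1: submit every record for deserialization on the pool.
            List<Future<InodeEntry>> futures = new ArrayList<>(rawRecords.size());
            for (String raw : rawRecords) {
                Callable<InodeEntry> task = () -> deserialize(raw);
                futures.add(pool.submit(task));
            }
            // Phase 2: only the calling thread touches the map, in the
            // original record order, so a plain HashMap is sufficient.
            Map<Long, InodeEntry> map = new HashMap<>();
            for (Future<InodeEntry> f : futures) {
                InodeEntry e = f.get();
                map.put(e.id, e);
            }
            return map;
        } finally {
            pool.shutdown();
        }
    }
}
```

Submitting one task per record keeps the sketch short; a real loader would more likely hand each worker a batch of records to keep task overhead down.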



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
