[jira] [Commented] (HDFS-7784) load fsimage in parallel

Walter Su (JIRA) Fri, 13 Feb 2015 07:01:55 -0800

    [ https://issues.apache.org/jira/browse/HDFS-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320201#comment-14320201 ]


Walter Su commented on HDFS-7784:
---------------------------------

I agree with you. A single Namenode with 64GB of memory can hold about 100 
million files (maybe a few more). In that situation the startup time drops from 
371s to 159s, which is not good enough. We usually don't restart the Namenode 
often, though, so I think it's OK to wait another 2 minutes for a restart. 
If people store 10x or 100x more than 100 million files, they should consider 
federation.
So I changed the priority to Minor. I'll still upload the patch; maybe it will 
help someone.

> load fsimage in parallel
> ------------------------
>
>                 Key: HDFS-7784
>                 URL: https://issues.apache.org/jira/browse/HDFS-7784
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Walter Su
>            Assignee: Walter Su
>            Priority: Minor
>         Attachments: HDFS-7784.001.patch, test-20150213.pdf
>
>
> When a single Namenode holds a huge number of files without using federation, 
> startup/restart is slow. The fsimage loading step takes most of the time. 
> fsimage loading can be separated into two parts: deserialization and object 
> construction (mostly map insertion). Deserialization takes most of the CPU 
> time, so we can deserialize in parallel and add to the hashmap serially. 
> This will significantly reduce the NN start time.
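
The split described in the issue (deserialize in parallel, insert into the map serially) can be sketched roughly as follows. This is only an illustration and not the code in the attached patch: the class ParallelLoadSketch, the Inode type, and the record layout (8-byte id, 4-byte name length, UTF-8 name) are all hypothetical stand-ins, and the real fsimage format is different.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelLoadSketch {

    // Hypothetical stand-in for an inode; not the real HDFS INode class.
    static final class Inode {
        final long id;
        final String name;
        Inode(long id, String name) { this.id = id; this.name = name; }
    }

    // Assumed record layout for this sketch: id (8 bytes), name length (4 bytes), name.
    static byte[] serialize(long id, String name) {
        byte[] n = name.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(12 + n.length).putLong(id).putInt(n.length).put(n).array();
    }

    // The CPU-heavy step: turn raw bytes back into an object.
    static Inode deserialize(byte[] record) {
        ByteBuffer buf = ByteBuffer.wrap(record);
        long id = buf.getLong();
        byte[] n = new byte[buf.getInt()];
        buf.get(n);
        return new Inode(id, new String(n, StandardCharsets.UTF_8));
    }

    static Map<Long, Inode> load(List<byte[]> records, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Phase 1: deserialize chunks of records on worker threads.
            int chunk = Math.max(1, (records.size() + threads - 1) / threads);
            List<Future<List<Inode>>> futures = new ArrayList<>();
            for (int start = 0; start < records.size(); start += chunk) {
                final List<byte[]> slice =
                    records.subList(start, Math.min(start + chunk, records.size()));
                Callable<List<Inode>> task = () -> {
                    List<Inode> out = new ArrayList<>(slice.size());
                    for (byte[] r : slice) {
                        out.add(deserialize(r));
                    }
                    return out;
                };
                futures.add(pool.submit(task));
            }
            // Phase 2: insert into a plain HashMap on the calling thread only,
            // so the map itself needs no synchronization.
            Map<Long, Inode> map = new HashMap<>();
            for (Future<List<Inode>> f : futures) {
                for (Inode inode : f.get()) {
                    map.put(inode.id, inode);
                }
            }
            return map;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("loading failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Collecting the futures in submission order keeps the insertion order the same as a serial load would produce. The speedup from such a scheme is bounded by the serial insertion phase, so it helps only to the extent that deserialization dominates the load time.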



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
