[
https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497961#comment-16497961
]
Kihwal Lee commented on HDFS-8865:
----------------------------------
The memory overhead is very small compared to the memory required to hold the
name space. The creation of child tasks is somewhat throttled by the thread pool
size, i.e. it does not explode by walking the entire namespace at once. Be sure
to port HDFS-9003 with it.
> Improve quota initialization performance
> ----------------------------------------
>
> Key: HDFS-8865
> URL: https://issues.apache.org/jira/browse/HDFS-8865
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Priority: Major
> Fix For: 2.8.0, 3.0.0-alpha1, 2.6.6, 2.7.5
>
> Attachments: HDFS-8865.branch-2.6.01.patch,
> HDFS-8865.branch-2.6.patch, HDFS-8865.branch-2.7.patch, HDFS-8865.patch,
> HDFS-8865.v2.checkstyle.patch, HDFS-8865.v2.patch, HDFS-8865.v3.patch,
> HDFS-8865_branch-2.6.patch, HDFS-8865_branch-2.7.patch
>
>
> After replaying edits, the whole file system tree is recursively scanned in
> order to initialize the quota. For a big name space, this can take a very long
> time. Since this is done during namenode failover, it also affects failover
> latency.
> By using the Fork-Join framework, I was able to greatly reduce the
> initialization time. The following is the test result using the fsimage from
> one of the big name nodes we have.
> || threads || seconds ||
> | 1 (existing) | 55 |
> | 1 (fork-join) | 68 |
> | 4 | 16 |
> | 8 | 8 |
> | 12 | 6 |
> | 16 | 5 |
> | 20 | 4 |
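
For readers unfamiliar with the Fork-Join framework, the following is a minimal
sketch of the general technique described above, not the code from the
HDFS-8865 patch. The class QuotaInitSketch and its Node, UsageTask, and scan
names are hypothetical stand-ins invented for illustration; the real namenode
works on its own inode classes. Each directory forks one subtask per child
directory, counts its files inline, and joins the subtasks to get the subtree
usage. The pool's parallelism bounds how many tasks run at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

class QuotaInitSketch {
    /** Hypothetical stand-in for an inode; this is not the HDFS INode class. */
    static final class Node {
        final long size;                                 // bytes used by this node itself
        final List<Node> children = new ArrayList<>();   // empty for a file
        long cachedUsage;                                // subtree usage, set by the scan

        Node(long size) { this.size = size; }

        Node add(Node child) { children.add(child); return this; }
    }

    /** Computes the usage of one subtree, forking a subtask per child directory. */
    static final class UsageTask extends RecursiveTask<Long> {
        private final Node node;

        UsageTask(Node node) { this.node = node; }

        @Override
        protected Long compute() {
            long total = node.size;
            List<UsageTask> forked = new ArrayList<>();
            for (Node child : node.children) {
                if (child.children.isEmpty()) {
                    // Leaf (file or empty directory): cheap, so count it inline.
                    child.cachedUsage = child.size;
                    total += child.size;
                } else {
                    // Non-empty directory: hand it to the pool as a separate task.
                    UsageTask task = new UsageTask(child);
                    task.fork();
                    forked.add(task);
                }
            }
            for (UsageTask task : forked) {
                total += task.join();   // wait for each forked subtree result
            }
            node.cachedUsage = total;
            return total;
        }
    }

    /** Scans the tree rooted at root using a pool with the given parallelism. */
    static long scan(Node root, int threads) {
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            return pool.invoke(new UsageTask(root));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Tiny made-up tree: /dirA/{10, 20, dirB/{7}} and /{5}
        Node dirB = new Node(0).add(new Node(7));
        Node dirA = new Node(0).add(new Node(10)).add(new Node(20)).add(dirB);
        Node root = new Node(0).add(dirA).add(new Node(5));
        System.out.println(scan(root, 4));   // 10 + 20 + 7 + 5 = 42
    }
}
```

Counting leaves inline instead of forking a task per file keeps the number of
task objects proportional to the number of directories, which is one plausible
way to keep the overhead of such a scan small.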