[ 
https://issues.apache.org/jira/browse/HDFS-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14695574#comment-14695574
 ] 

Xiaoyu Yao commented on HDFS-8865:
----------------------------------

Thanks for the patch, [~kihwal]! It looks pretty good to me. 

Just a few comments:
1. The numbers for the large namespace look impressive. Do you have numbers for 
small/medium namespaces? 

2. Is it possible to add some timing info between the two log statements below 
so that we can easily find from the log how long quota initialization takes?
{code}
LOG.info("Initializing quota with " + threads + " thread(s)");

...
LOG.info("Quota initialization complete.\n" + counts);
{code}
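
To make item 2 concrete, here is a minimal sketch of timing the work between the two log statements using the monotonic System.nanoTime(). The class QuotaInitTiming, the Timed holder, and the time() helper are hypothetical names for illustration only, and System.out stands in for LOG; the actual patch may well prefer an existing Hadoop timing utility.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class QuotaInitTiming {
    /** Hypothetical holder: a result plus the elapsed time taken to produce it. */
    public static final class Timed<T> {
        public final T value;
        public final long elapsedMillis;

        Timed(T value, long elapsedMillis) {
            this.value = value;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Runs the given work and measures it with a monotonic clock. */
    public static <T> Timed<T> time(Supplier<T> work) {
        long start = System.nanoTime();
        T value = work.get();
        long elapsedNanos = System.nanoTime() - start;
        return new Timed<>(value, TimeUnit.NANOSECONDS.toMillis(elapsedNanos));
    }

    public static void main(String[] args) {
        int threads = 4;
        System.out.println("Initializing quota with " + threads + " thread(s)");

        // Placeholder for the real quota scan; returns a stand-in for the counts.
        Timed<String> result = time(() -> "counts-placeholder");

        System.out.println("Quota initialization completed in "
            + result.elapsedMillis + " ms.\n" + result.value);
    }
}
```

With the elapsed time in the completion message, the duration can be read from a single log line instead of subtracting two timestamps.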

3. Can you change to parameterized logging to avoid constructing the message 
when the log statement is disabled? For example, 
{code}
LOG.debug("Setting quota for {}\n{}", dir, myCounts);
{code}
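
To illustrate why item 3 matters, the sketch below counts how often toString() runs when a debug-level statement is disabled. It uses java.util.logging (with its {0}/{1} placeholders) purely as a dependency-free stand-in for the {} style logger shown above; the Counts class and the directory path are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

public class DeferredLoggingDemo {
    /** Hypothetical stand-in for the quota counts; records each toString() call. */
    public static final class Counts {
        public final AtomicInteger renderings = new AtomicInteger();

        @Override
        public String toString() {
            renderings.incrementAndGet();
            return "nsCount=1, ssCount=2";
        }
    }

    /** Returns {eager, deferred}: toString() calls made while FINE is disabled. */
    public static int[] run() {
        Logger log = Logger.getLogger("DeferredLoggingDemo");
        log.setLevel(Level.INFO); // FINE (debug-level) messages are disabled
        String dir = "/user/example";

        // Concatenation builds the full message before the level check.
        Counts eager = new Counts();
        log.fine("Setting quota for " + dir + "\n" + eager);

        // Parameters are only formatted if the level is enabled.
        Counts deferred = new Counts();
        log.log(Level.FINE, "Setting quota for {0}\n{1}", new Object[] {dir, deferred});

        return new int[] {eager.renderings.get(), deferred.renderings.get()};
    }

    public static void main(String[] args) {
        int[] calls = run();
        System.out.println("eager=" + calls[0] + " deferred=" + calls[1]);
    }
}
```

The concatenated form pays for toString() even though nothing is logged, while the parameterized form does not. For a statement executed once per directory with a quota, that difference adds up.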

4. NIT: typo chached -> cached?
{code}
// Directly access the name system to obtain the current chached usage.
{code}

5. Now that HDFS-8879 is in, can you rebase and update the patch? Thanks!

> Improve quota initialization performance
> ----------------------------------------
>
>                 Key: HDFS-8865
>                 URL: https://issues.apache.org/jira/browse/HDFS-8865
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>         Attachments: HDFS-8865.patch, HDFS-8865.v2.checkstyle.patch, 
> HDFS-8865.v2.patch
>
>
> After replaying edits, the whole file system tree is recursively scanned in 
> order to initialize the quota. For a big namespace, this can take a very long 
> time. Since this is done during namenode failover, it also affects failover 
> latency.
> By using the Fork-Join framework, I was able to greatly reduce the 
> initialization time.  The following is the test result using the fsimage from 
> one of the big name nodes we have.
> || threads || seconds||
> | 1 (existing) | 55|
> | 1 (fork-join) | 68 |
> | 4 | 16 |
> | 8 | 8 |
> | 12 | 6 |
> | 16 | 5 |
> | 20 | 4 |
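
For readers unfamiliar with the Fork-Join framework mentioned in the description, the following is a minimal sketch of a parallel recursive tree scan. It is not the code from the patch: the Dir node type, the CountTask class, and the one-task-per-child splitting policy are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinTreeCount {
    /** Hypothetical directory node: its own file count plus child directories. */
    public static final class Dir {
        final long files;
        final List<Dir> children = new ArrayList<>();

        public Dir(long files) {
            this.files = files;
        }

        public Dir add(Dir child) {
            children.add(child);
            return this;
        }
    }

    /** Sums file counts over a subtree, forking one subtask per child directory. */
    static final class CountTask extends RecursiveTask<Long> {
        private final Dir dir;

        CountTask(Dir dir) {
            this.dir = dir;
        }

        @Override
        protected Long compute() {
            List<CountTask> forked = new ArrayList<>();
            for (Dir child : dir.children) {
                CountTask task = new CountTask(child);
                task.fork(); // schedule the child subtree on the pool
                forked.add(task);
            }
            long total = dir.files;
            for (CountTask task : forked) {
                total += task.join(); // wait for and add each subtree's count
            }
            return total;
        }
    }

    /** Counts all files under root using a pool with the given parallelism. */
    public static long count(Dir root, int threads) {
        ForkJoinPool pool = new ForkJoinPool(threads);
        try {
            return pool.invoke(new CountTask(root));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Dir root = new Dir(1).add(new Dir(2)).add(new Dir(3).add(new Dir(4)));
        System.out.println(count(root, 4));
    }
}
```

Forking a task per directory adds scheduling overhead, which is one plausible reason a single-threaded fork-join run could be slower than a plain recursive scan; a real implementation would likely fork only for sufficiently large subtrees.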



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
