[ 
https://issues.apache.org/jira/browse/HDFS-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13280332#comment-13280332
 ] 

Colin Patrick McCabe commented on HDFS-2982:
--------------------------------------------

I renamed the resync parameter to skipBrokenEdits, since that is what it is 
called in Reader#readOp, and this function just passes it through to there.  
The name is also a pretty concise description of what it does.
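
To illustrate the naming point, here is a minimal sketch of a flag being 
forwarded unchanged to a reader.  It is not the patch itself: ToyReader, nextOp, 
and the use of a null entry to stand for an undecodable edit are all 
hypothetical, and the real Reader#readOp may behave differently in its details.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class SkipBrokenEditsSketch {
    /** Toy reader (hypothetical): a null entry stands in for an edit that fails to decode. */
    static final class ToyReader {
        private final Iterator<String> ops;

        ToyReader(List<String> ops) {
            this.ops = ops.iterator();
        }

        /** Returns the next readable op, or null at end of stream. */
        String readOp(boolean skipBrokenEdits) {
            while (ops.hasNext()) {
                String op = ops.next();
                if (op != null) {
                    return op;
                }
                if (!skipBrokenEdits) {
                    throw new IllegalStateException("broken edit");
                }
                // skipBrokenEdits is set: ignore this entry and try the next one.
            }
            return null;
        }
    }

    /** Caller that only forwards the flag, so it uses the same parameter name. */
    static String nextOp(ToyReader reader, boolean skipBrokenEdits) {
        return reader.readOp(skipBrokenEdits);
    }

    public static void main(String[] args) {
        ToyReader reader = new ToyReader(Arrays.<String>asList(null, "OP_MKDIR"));
        System.out.println(nextOp(reader, true));
    }
}
```

With the flag set, the broken entry is skipped and the following op is 
returned; with it cleared, the same input raises an exception instead.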

The changes to TestNameNodeRecovery are for correctness.  Formerly, we were 
doing multiple mkdirs operations on the same directory.  This resulted in only 
one mkdir operation getting added to the stream.  Then when we corrupted the 
last edit, that single mkdir operation was lost, which is a bad thing, since we 
check for it later.
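
The failure mode can be sketched with a toy model.  This is only an 
illustration under stated assumptions: ToyNamespace, its string-based log, and 
the example paths are hypothetical and are not the real HDFS classes or the 
paths used by the test.  The sketch assumes that a mkdirs call on an existing 
directory logs nothing, and that a corrupted final edit is simply dropped on 
replay.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class MkdirEditLogSketch {
    /** Toy namespace (hypothetical): one log entry per directory actually created. */
    static final class ToyNamespace {
        final Set<String> dirs = new HashSet<>();
        final List<String> editLog = new ArrayList<>();

        /** Logs an op only when the directory did not already exist. */
        void mkdirs(String path) {
            if (dirs.add(path)) {
                editLog.add("MKDIR " + path);
            }
        }
    }

    /** Replays every op except the last, mimicking a corrupted final edit. */
    static Set<String> replayDroppingLastEdit(List<String> editLog) {
        Set<String> recovered = new HashSet<>();
        for (int i = 0; i < editLog.size() - 1; i++) {
            recovered.add(editLog.get(i).substring("MKDIR ".length()));
        }
        return recovered;
    }

    /** Runs the mkdirs calls, then returns what survives losing the last edit. */
    static Set<String> recoverAfterMkdirs(String... paths) {
        ToyNamespace ns = new ToyNamespace();
        for (String path : paths) {
            ns.mkdirs(path);
        }
        return replayDroppingLastEdit(ns.editLog);
    }

    public static void main(String[] args) {
        // Same directory three times: one edit logged, nothing left after corruption.
        System.out.println(recoverAfterMkdirs("/dir", "/dir", "/dir"));
        // Distinct directories: three edits logged, the first two survive.
        System.out.println(recoverAfterMkdirs("/dir0", "/dir1", "/dir2"));
    }
}
```

In this model, repeating the same path leaves nothing to check for after the 
last edit is lost, while distinct paths leave the earlier directories intact.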

I'm not 100% sure if calling cluster.waitActive() in this test is necessary, 
since we have 0 DataNodes.  However, we do it everywhere else, and consistency 
is a good thing.  Also, conceptually what we want is for the NameNode to come 
up and be active.  It seems more robust to check for that directly rather than 
assuming that no part of edit log loading happens in the background.
                
> Startup performance suffers when there are many edit log segments
> -----------------------------------------------------------------
>
>                 Key: HDFS-2982
>                 URL: https://issues.apache.org/jira/browse/HDFS-2982
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 2.0.0
>            Reporter: Todd Lipcon
>            Assignee: Colin Patrick McCabe
>            Priority: Critical
>         Attachments: HDFS-2982.001.patch, HDFS-2982.002.patch, 
> HDFS-2982.003.patch, HDFS-2982.004.patch, HDFS-2982.005.patch, 
> HDFS-2982.006.patch
>
>
> For every one of the edit log segments, it seems like we are calling 
> listFiles on the edit log directory inside of {{findMaxTransaction}}. This is 
> killing performance, especially when there are many log segments and the 
> directory is stored on NFS. It is taking several minutes to start up the NN 
> when there are several thousand log segments present.


        
