[
https://issues.apache.org/jira/browse/HDFS-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13278582#comment-13278582
]
Eli Collins commented on HDFS-2982:
-----------------------------------
Hey Colin,
Took a quick look. How about describing the high-level approach in the patch?
- The javadoc for JournalSet#selectInputStreams is a little over-simplified =)
- How about describing the algorithm (get the streams starting with fromTxid
from all managers, return a list sorted by the starting txid, etc.)?
- In EditLogFileInputStream#init, why only close the stream that threw?
- In TestEditLog, readAllEdits is dead code.
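
For reference, the algorithm summarized in the javadoc comment above (collect the streams starting with fromTxid from all managers, then sort by starting txid) could be sketched roughly as follows. This is not the code from the patch. Segment, Manager, and SelectStreamsSketch are hypothetical stand-ins, and "starting with fromTxid" is assumed to mean "any segment that contains fromTxid or begins after it".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class SelectStreamsSketch {
    /** Hypothetical stand-in for an edit log stream; only the txid range matters here. */
    static final class Segment {
        final String source;
        final long firstTxId;
        final long lastTxId;

        Segment(String source, long firstTxId, long lastTxId) {
            this.source = source;
            this.firstTxId = firstTxId;
            this.lastTxId = lastTxId;
        }
    }

    /** Hypothetical stand-in for a journal manager that knows its own segments. */
    interface Manager {
        List<Segment> segments();
    }

    /**
     * Gathers candidate segments from every manager and returns them
     * ordered by starting txid.
     */
    static List<Segment> selectInputStreams(List<Manager> managers, long fromTxId) {
        List<Segment> selected = new ArrayList<>();
        for (Manager manager : managers) {
            for (Segment segment : manager.segments()) {
                // Assumption: keep a segment if it contains fromTxId or starts after it.
                if (segment.lastTxId >= fromTxId) {
                    selected.add(segment);
                }
            }
        }
        // Sort by the first transaction id so callers can read the segments in order.
        selected.sort(Comparator.comparingLong((Segment s) -> s.firstTxId));
        return selected;
    }
}
```

The sketch does not deal with redundant segments that cover the same txid range in more than one manager; a real implementation would have to choose among them.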
> Startup performance suffers when there are many edit log segments
> -----------------------------------------------------------------
>
> Key: HDFS-2982
> URL: https://issues.apache.org/jira/browse/HDFS-2982
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 2.0.0
> Reporter: Todd Lipcon
> Assignee: Colin Patrick McCabe
> Priority: Critical
> Attachments: HDFS-2982.001.patch
>
>
> For every one of the edit log segments, it seems like we are calling
> listFiles on the edit log directory inside of {{findMaxTransaction}}. This is
> killing performance, especially when there are many log segments and the
> directory is stored on NFS. It is taking several minutes to start up the NN
> when there are several thousand log segments present.
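
To illustrate the cost described in the issue, here is a minimal sketch of the general remedy: list the directory once and do all per-segment work from the in-memory result. It is not the fix from the attached patch. SingleListingSketch and its methods are hypothetical, and the finalized-segment name format edits_<firstTxId>-<lastTxId> is an assumption made for the example.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SingleListingSketch {
    // Assumed name format for a finalized segment: edits_<firstTxId>-<lastTxId>
    private static final Pattern FINALIZED = Pattern.compile("edits_(\\d+)-(\\d+)");

    /** Works purely on already-listed names, so it never touches the file system. */
    static long maxTxIdFromNames(List<String> names) {
        long max = -1;
        for (String name : names) {
            Matcher m = FINALIZED.matcher(name);
            if (m.matches()) {
                // Group 2 is the last txid recorded in the segment's file name.
                max = Math.max(max, Long.parseLong(m.group(2)));
            }
        }
        return max;
    }

    /** Lists the directory exactly once, however many segments it holds. */
    static long maxTxId(File dir) {
        String[] listed = dir.list(); // null if dir is not a readable directory
        List<String> names = new ArrayList<>();
        if (listed != null) {
            for (String name : listed) {
                names.add(name);
            }
        }
        return maxTxIdFromNames(names);
    }
}
```

With several thousand segments on NFS, this replaces one directory listing per segment with a single listing. The sketch ignores in-progress segments, whose last txid cannot be read from the file name.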