[ https://issues.apache.org/jira/browse/HDFS-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Todd Lipcon resolved HDFS-1780.
-------------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]

> reduce need to rewrite fsimage on startup
> -----------------------------------------
>
>                 Key: HDFS-1780
>                 URL: https://issues.apache.org/jira/browse/HDFS-1780
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: Edit log branch (HDFS-1073)
>            Reporter: Daryn Sharp
>            Assignee: Todd Lipcon
>             Fix For: Edit log branch (HDFS-1073)
>
>         Attachments: hdfs-1780.txt, hdfs-1780.txt
>
>
> On startup, the namenode will read the fs image, apply edits, then rewrite
> the fs image. This requires a non-trivial amount of time for very large
> directory structures. Perhaps the namenode should employ some logic to
> decide when the edits are simple enough that rewriting the image back out
> to disk is not warranted.
> A few ideas:
> * Use the size of the edit logs: if the size is below a threshold, assume
>   it's cheaper to reprocess the edit log instead of writing the image back
>   out.
> * Time the processing of the edits: if the time is below a defined
>   threshold, the image isn't rewritten.
> * Time both the reading of the image and the processing of the edits. Base
>   the decision on the time it would take to write the image (a multiplier
>   is applied to the read time?) versus the time it would take to reprocess
>   the edits. If a certain threshold (perhaps percentage or expected time to
>   rewrite) is exceeded, rewrite the image.
> Something along the lines of the last suggestion may allow for defaults
> that adapt to any size cluster, thus eliminating the need to keep tweaking
> a cluster's settings based on its size.
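
For illustration, the last idea in the description (estimate the image write
time from the measured read time, then compare it with the edit replay time)
could be sketched roughly as below. This is only a hypothetical sketch, not the
attached patch or anything in the HDFS-1073 branch: the class name
ImageRewriteHeuristic, both constructor parameters, and the example values are
assumptions made up for this example.

```java
/**
 * Hypothetical sketch of the "time the image read and the edit replay" idea.
 * Nothing here is taken from the HDFS code base; all names are assumptions.
 */
final class ImageRewriteHeuristic {
    // Assumed multiplier: estimated write time = measured read time * ratio.
    private final double writeToReadRatio;
    // Assumed threshold: rewrite once replaying the edits costs at least
    // this fraction of the estimated image write time.
    private final double rewriteFraction;

    ImageRewriteHeuristic(double writeToReadRatio, double rewriteFraction) {
        if (writeToReadRatio <= 0 || rewriteFraction <= 0) {
            throw new IllegalArgumentException("ratios must be positive");
        }
        this.writeToReadRatio = writeToReadRatio;
        this.rewriteFraction = rewriteFraction;
    }

    /**
     * @param imageReadMillis   measured time to load the fs image
     * @param editsReplayMillis measured time to apply the edit log
     * @return true if the image should be written back out to disk
     */
    boolean shouldRewriteImage(long imageReadMillis, long editsReplayMillis) {
        double estimatedWriteMillis = imageReadMillis * writeToReadRatio;
        return editsReplayMillis >= rewriteFraction * estimatedWriteMillis;
    }
}
```

Because both inputs are measured on the running cluster, the cutoff scales with
the image size rather than being a fixed number of bytes or seconds, which is
the adaptive behaviour the description asks for. For example, with a ratio of
1.5 and a fraction of 0.1, a 100 s image read gives a cutoff of 15 s of edit
replay time.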