[ https://issues.apache.org/jira/browse/HDFS-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13065665#comment-13065665 ]

Hudson commented on HDFS-1780:
------------------------------

Integrated in Hadoop-Hdfs-1073-branch #9 (See [https://builds.apache.org/job/Hadoop-Hdfs-1073-branch/9/])
    HDFS-1780. Reduce need to rewrite FSImage on startup. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1146858
Files : 
* /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImageTransactionalStorageInspector.java
* /hadoop/common/branches/HDFS-1073/hdfs/CHANGES.HDFS-1073.txt
* /hadoop/common/branches/HDFS-1073/hdfs/src/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
* /hadoop/common/branches/HDFS-1073/hdfs/src/test/hdfs/org/apache/hadoop/hdfs/server/namenode/TestEditLog.java


> reduce need to rewrite fsimage on startup
> -----------------------------------------
>
>                 Key: HDFS-1780
>                 URL: https://issues.apache.org/jira/browse/HDFS-1780
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: Edit log branch (HDFS-1073)
>            Reporter: Daryn Sharp
>            Assignee: Todd Lipcon
>             Fix For: Edit log branch (HDFS-1073)
>
>         Attachments: hdfs-1780.txt, hdfs-1780.txt
>
>
> On startup, the namenode reads the fs image, applies the edits, then 
> rewrites the fs image.  The rewrite takes a non-trivial amount of time for 
> very large directory structures.  Perhaps the namenode should employ some 
> logic to decide that the edits are simple enough that rewriting the image 
> back out to disk is not warranted.
> A few ideas:
> 1. Use the size of the edit logs: if the size is below a threshold, assume 
>    it's cheaper to reprocess the edit log than to write the image back out.
> 2. Time the processing of the edits: if the time is below a defined 
>    threshold, don't rewrite the image.
> 3. Time both the reading of the image and the processing of the edits.  
>    Base the decision on the time it would take to write the image (a 
>    multiplier applied to the read time?) versus the time it would take to 
>    reprocess the edits.  If a certain threshold (perhaps a percentage or an 
>    expected time to rewrite) is exceeded, rewrite the image.
> Something along the lines of the last suggestion may allow for defaults that 
> adapt to any size of cluster, thus eliminating the need to keep tweaking a 
> cluster's settings based on its size.
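
The last idea in the description could be sketched roughly as follows. This is only an illustration of the proposed heuristic, not the change that was committed for HDFS-1780 and not an existing Hadoop API: the class name ImageRewritePolicy, the method shouldRewriteImage, and both tuning parameters (writeToReadRatio, replayFraction) are hypothetical names introduced here.

```java
/**
 * Hypothetical sketch of the timing-based heuristic from the issue
 * description. None of these names exist in Hadoop.
 */
public final class ImageRewritePolicy {
    // Assumed multiplier that turns the measured image read time into an
    // estimate of the image write time.
    private final double writeToReadRatio;
    // Assumed threshold: rewrite once replaying the edits costs more than
    // this fraction of the estimated write time.
    private final double replayFraction;

    public ImageRewritePolicy(double writeToReadRatio, double replayFraction) {
        if (writeToReadRatio <= 0 || replayFraction <= 0) {
            throw new IllegalArgumentException("parameters must be positive");
        }
        this.writeToReadRatio = writeToReadRatio;
        this.replayFraction = replayFraction;
    }

    /**
     * @param imageReadMillis   measured time to load the fs image
     * @param editsReplayMillis measured time to apply the edit log
     * @return true if the image should be written back out to disk
     */
    public boolean shouldRewriteImage(long imageReadMillis, long editsReplayMillis) {
        double estimatedWriteMillis = imageReadMillis * writeToReadRatio;
        return editsReplayMillis > estimatedWriteMillis * replayFraction;
    }
}
```

Because both inputs are measured on the running cluster, the same two parameters would scale with image size, which is the adaptive behaviour the description is asking for.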

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
