[ 
https://issues.apache.org/jira/browse/HDFS-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Molkov updated HDFS-1071:
--------------------------------

    Attachment: HDFS-1071.6.patch

I added a documentation for FSImageSaver that describes initial assumptions for 
how the parallel writes are being done.

As far as writing the image to the single disk in multiple directories. If you 
do it in parallel that might only hurt the performance, since the disk will do 
seeks all the times.

> savenamespace should write the fsimage to all configured fs.name.dir in 
> parallel
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-1071
>                 URL: https://issues.apache.org/jira/browse/HDFS-1071
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: Dmytro Molkov
>         Attachments: HDFS-1071.2.patch, HDFS-1071.3.patch, HDFS-1071.4.patch, 
> HDFS-1071.5.patch, HDFS-1071.6.patch, HDFS-1071.patch
>
>
> If you have a large number of files in HDFS, the fsimage file is very big. 
> When the namenode restarts, it writes a copy of the fsimage to all 
> directories configured in fs.name.dir. This takes a long time, especially if 
> there are many directories in fs.name.dir. Make the NN write the fsimage to 
> all these directories in parallel.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to