[ 
https://issues.apache.org/jira/browse/HDFS-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12882000#action_12882000
 ] 

Konstantin Shvachko commented on HDFS-1071:
-------------------------------------------

So this has nothing to do with safe mode, then.
As I understand it, the main thread holds the global (FSNamesystem) lock, and 
nothing else is going to be executed on the NN at that time. This seems to be 
the answer to the locking question. Could you please add JavaDoc to the 
{{FSImageSaver}} class summarizing the locking and the image identity 
issues.

The approach with one serializing thread and the others doing the writes is in 
fact not harder. The queue growth issue is not a problem IMO. The speed of the 
total write depends on the slowest writer, so everybody can simply wait until 
the slowest one completes its assignment; they will have to wait for it anyway 
in the end.
The advantage of this approach is that we guarantee everybody writes exactly 
the same bytes into the image files.
With your approach, although the implementation is simpler, it is not obvious 
that the contents of the image files will be the same; at least it was not 
obvious to me.
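
To make the single-serializer idea concrete, here is a rough sketch of what I have in mind. It is not taken from any of the attached patches: the class name {{FanOutImageWriter}}, the {{save}} method, and the chunk iterator are all hypothetical, and plain output streams stand in for the image files. One thread produces each chunk once and hands the same array to a bounded queue per writer, so the serializer blocks whenever the slowest writer falls behind.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch: one serializing thread feeds identical chunks to N writers. */
class FanOutImageWriter {
  // Sentinel marking end of stream; compared by reference and never written.
  private static final byte[] EOF = new byte[0];

  /**
   * Runs on the single serializing thread. Each chunk is produced once and
   * queued for every target, so all targets receive the same byte sequence.
   */
  static void save(Iterator<byte[]> chunks, List<? extends OutputStream> targets,
                   int queueCapacity) throws InterruptedException, IOException {
    List<BlockingQueue<byte[]>> queues = new ArrayList<>();
    List<Thread> writers = new ArrayList<>();
    List<IOException> failures = Collections.synchronizedList(new ArrayList<>());

    for (OutputStream out : targets) {
      BlockingQueue<byte[]> q = new ArrayBlockingQueue<>(queueCapacity);
      queues.add(q);
      Thread t = new Thread(() -> {
        boolean failed = false;
        try {
          for (byte[] c = q.take(); c != EOF; c = q.take()) {
            if (failed) {
              continue; // keep draining so the serializer never blocks on a dead writer
            }
            try {
              out.write(c);
            } catch (IOException e) {
              failed = true;
              failures.add(e);
            }
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
      t.start();
      writers.add(t);
    }

    while (chunks.hasNext()) {
      byte[] chunk = chunks.next();
      for (BlockingQueue<byte[]> q : queues) {
        q.put(chunk); // blocks while this writer's queue is full (slowest writer sets the pace)
      }
    }
    for (BlockingQueue<byte[]> q : queues) {
      q.put(EOF);
    }
    for (Thread t : writers) {
      t.join();
    }
    if (!failures.isEmpty()) {
      throw failures.get(0); // report the first write failure, if any
    }
  }
}
```

Because the queues are bounded, memory use is capped at roughly {{queueCapacity}} chunks per directory, which is why I do not see queue growth as a problem. Handling of a failed directory is deliberately crude here (the writer just drains its queue); the real code would need to mark that storage directory as bad.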

Out of pure curiosity: do you know whether there is any benefit to 
multithreaded writing to directories on the same drive?

> savenamespace should write the fsimage to all configured fs.name.dir in 
> parallel
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-1071
>                 URL: https://issues.apache.org/jira/browse/HDFS-1071
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: Dmytro Molkov
>         Attachments: HDFS-1071.2.patch, HDFS-1071.3.patch, HDFS-1071.4.patch, 
> HDFS-1071.5.patch, HDFS-1071.patch
>
>
> If you have a large number of files in HDFS, the fsimage file is very big. 
> When the namenode restarts, it writes a copy of the fsimage to all 
> directories configured in fs.name.dir. This takes a long time, especially if 
> there are many directories in fs.name.dir. Make the NN write the fsimage to 
> all these directories in parallel.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
