[jira] Updated: (HDFS-1071) savenamespace should write the fsimage to all configured fs.name.dir in parallel

Dmytro Molkov (JIRA) Tue, 01 Jun 2010 13:49:04 -0700

     [ https://issues.apache.org/jira/browse/HDFS-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Dmytro Molkov updated HDFS-1071:
--------------------------------

    Attachment: HDFS-1071.4.patch

I added a test that also runs saveNamespace on a running cluster and then 
checks the image files that were written.
As far as locking is concerned: saveNamespace can only be done in safemode, 
so we are only performing read operations on a data structure that is 
essentially read-only, and we do so while the parent thread is holding the 
lock, right?

As for performance: the current image in our biggest cluster is ~11G, and it 
takes 1.5-2 minutes to write it out to the disk and 1.5-2 minutes to write it 
out to the filer. With parallel writes those latencies overlap completely, so 
writing both copies will take 1.5-2 minutes in total. That saves us about 1.5 
minutes (80-100 seconds is the time of the faster write).
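
To make the arithmetic above concrete, here is a minimal sketch of the idea, 
not the code in HDFS-1071.4.patch. The class ParallelImageSaver and its 
methods are hypothetical names, the image is held in a byte array only for 
brevity, and the 90 s / 120 s figures in the note below are illustrative 
values picked from the 1.5-2 minute range. It starts one writer thread per 
directory and joins them all, so the wall-clock cost is the slowest write 
rather than the sum of all writes.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not taken from the attached patch.
class ParallelImageSaver {

    // Writes the same image bytes to <dir>/fsimage for every directory,
    // using one thread per directory. Returns the directories that failed.
    static List<Path> saveToAll(byte[] image, List<Path> dirs)
            throws InterruptedException {
        List<Path> failed = new ArrayList<>();
        List<Thread> writers = new ArrayList<>();
        for (Path dir : dirs) {
            Thread t = new Thread(() -> {
                try {
                    Files.write(dir.resolve("fsimage"), image);
                } catch (IOException e) {
                    // Record the failure; the other writers keep going.
                    synchronized (failed) {
                        failed.add(dir);
                    }
                }
            }, "saveNamespace-" + dir);
            writers.add(t);
            t.start();
        }
        // The caller blocks here until every directory has been written.
        for (Thread t : writers) {
            t.join();
        }
        return failed;
    }

    // Sequential writes cost the sum of the per-directory times.
    static long sequentialSeconds(long... perDir) {
        long total = 0;
        for (long s : perDir) {
            total += s;
        }
        return total;
    }

    // Fully overlapped writes cost only as much as the slowest one.
    static long parallelSeconds(long... perDir) {
        long slowest = 0;
        for (long s : perDir) {
            slowest = Math.max(slowest, s);
        }
        return slowest;
    }
}
```

With two directories that take 90 s and 120 s, sequentialSeconds(90, 120) is 
210 and parallelSeconds(90, 120) is 120, so the saving equals the time of the 
faster write. This assumes the two targets do not contend for the same device 
or network link.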

> savenamespace should write the fsimage to all configured fs.name.dir in 
> parallel
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-1071
>                 URL: https://issues.apache.org/jira/browse/HDFS-1071
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>            Reporter: dhruba borthakur
>            Assignee: Dmytro Molkov
>         Attachments: HDFS-1071.2.patch, HDFS-1071.3.patch, HDFS-1071.4.patch, 
> HDFS-1071.patch
>
>
> If you have a large number of files in HDFS, the fsimage file is very big. 
> When the namenode restarts, it writes a copy of the fsimage to all 
> directories configured in fs.name.dir. This takes a long time, especially if 
> there are many directories in fs.name.dir. Make the NN write the fsimage to 
> all these directories in parallel.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
