[
https://issues.apache.org/jira/browse/HDFS-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15041946#comment-15041946
]
Chris Trezzo commented on HDFS-8578:
------------------------------------
[~jrottinghuis]
bq. With 12 disks and 3 namespaces that would mean 36 parallel threads right?
Hard-linking is also parallelized within each of these threads (12 threads by
default). So the maximum number of threads you could potentially see is 12
disks (strictly speaking it is storage directories, but let's assume one
storage dir per disk) * 3 namespaces * 12 (default, but configurable)
hard-link worker threads = 432 threads.
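To make the worst-case bound above concrete, it is just the product of the three factors, all of which are configurable (a toy sketch; the class and method names here are made up, not HDFS code):

```java
// Worst-case upgrade thread count: storage dirs * namespaces * hard-link workers.
// The defaults used in the example above are 12, 3, and 12 respectively.
public class ThreadCountExample {
  public static int worstCase(int storageDirs, int namespaces, int hardLinkWorkers) {
    return storageDirs * namespaces * hardLinkWorkers;
  }

  public static void main(String[] args) {
    System.out.println(worstCase(12, 3, 12)); // prints 432
  }
}
```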
bq. I had seen OOM errors with 2.6 release when it processed 6 disks in
parallel.
[~raju.bairishetti] [~vinayrpet] Do you have a better sense of how much the
memory footprint of the datanode increases due to this parallelism?
The only other case I see where we do a full scan of the storage directories
for hard-linking is the {{DataStorage#prepareVolume}} code path. This has
already been parallelized in the {{DataNode#refreshVolumes}} method. The
number of parallel threads in that case is 1 per changed storage location * 12
(default, but configurable) hard-link worker threads.
I am attempting to revert one of our test clusters back to a 256x256 layout
and will test this patch on that cluster. Without parallelism, datanodes on
this cluster were each taking ~1.5 hours to upgrade.
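The sequential loop quoted in the description below could be parallelized along these lines. This is a hypothetical sketch, not the actual patch: {{ParallelUpgradeSketch}}, its {{doTransition}} placeholder, and the counter are all stand-ins for the real {{BlockPoolSliceStorage}} logic, and it simply fans the per-directory work out over a fixed thread pool and waits for every task to finish so failures still surface.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

public class ParallelUpgradeSketch {
  // Visible side effect so the demo can show all dirs were processed.
  static final AtomicInteger processed = new AtomicInteger();

  // Placeholder for the per-directory upgrade work
  // (BlockPoolSliceStorage#doTransition in HDFS).
  static void doTransition(String dir) {
    processed.incrementAndGet();
  }

  // Submit one task per storage dir, then block until all complete.
  public static void upgradeAll(List<String> dirs, int parallelism) {
    ExecutorService pool = Executors.newFixedThreadPool(parallelism);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (String dir : dirs) {
        futures.add(pool.submit(() -> doTransition(dir)));
      }
      for (Future<?> f : futures) {
        f.get(); // rethrows any worker failure as ExecutionException
      }
    } catch (InterruptedException | ExecutionException e) {
      throw new RuntimeException("storage dir upgrade failed", e);
    } finally {
      pool.shutdown();
    }
  }

  public static void main(String[] args) {
    upgradeAll(Arrays.asList("disk1", "disk2", "disk3"), 3);
    System.out.println("processed " + processed.get() + " dirs");
  }
}
```

With per-directory work taking ~20 minutes, a pool sized to the number of disks brings total upgrade time back toward the cost of the slowest single directory rather than the sum of all of them.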
> On upgrade, Datanode should process all storage/data dirs in parallel
> ---------------------------------------------------------------------
>
> Key: HDFS-8578
> URL: https://issues.apache.org/jira/browse/HDFS-8578
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Reporter: Raju Bairishetti
> Assignee: Vinayakumar B
> Priority: Critical
> Attachments: HDFS-8578-01.patch, HDFS-8578-02.patch,
> HDFS-8578-03.patch, HDFS-8578-04.patch, HDFS-8578-05.patch,
> HDFS-8578-06.patch, HDFS-8578-07.patch, HDFS-8578-08.patch,
> HDFS-8578-09.patch, HDFS-8578-10.patch, HDFS-8578-11.patch,
> HDFS-8578-12.patch, HDFS-8578-branch-2.6.0.patch
>
>
> Right now, during upgrades the datanode processes all the storage dirs
> sequentially. Assuming it takes ~20 minutes to process a single storage dir,
> a datanode with ~10 disks will take over 3 hours to come up.
> *BlockPoolSliceStorage.java*
> {code}
> for (int idx = 0; idx < getNumStorageDirs(); idx++) {
>   doTransition(datanode, getStorageDir(idx), nsInfo, startOpt);
>   assert getCTime() == nsInfo.getCTime()
>       : "Data-node and name-node CTimes must be the same.";
> }
> {code}
> It would save a lot of time during major upgrades if the datanode processed
> all storage dirs/disks in parallel.
> Can we make the datanode process all storage dirs in parallel?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)