[jira] [Commented] (HDFS-8578) On upgrade, Datanode should process all storage/data dirs in parallel

Vinayakumar B (JIRA) Thu, 03 Dec 2015 07:18:50 -0800

    [ https://issues.apache.org/jira/browse/HDFS-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037916#comment-15037916 ]


Vinayakumar B commented on HDFS-8578:
-------------------------------------

bq. could we assign these values earlier at the time you load properties from 
all the StorageDirectories for getting datanodeUuid? This could make things 
more clear.
All properties are already loaded when the VERSION file is read.
{code}
// Load the VERSION file to know the datanodeUUID
readProperties(sd);
{code}

bq. Use multiple threads to write a shared field without locking is not a good 
choice although we do not have any problem right now. It implies that the 
method is thread safe so later people may add some dangerous code in the 
method...
That's a good suggestion, thanks [~Apache9]. I have made 
{{DataStorage#setFieldsFromProperties(..)}} synchronized.
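
For illustration only, here is a minimal sketch of the idea, not the actual patch. {{SharedStorageInfo}}, {{loadInParallel}} and the property values are hypothetical stand-ins; only the pattern of a synchronized setter guarding fields written by several loader threads is taken from the discussion above.

```java
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class SynchronizedFieldsSketch {

    // Hypothetical stand-in for the shared storage state; NOT the real DataStorage class.
    public static class SharedStorageInfo {
        private String datanodeUuid;
        private long cTime;
        private int updates;

        // synchronized: only one thread at a time may write the shared fields.
        public synchronized void setFieldsFromProperties(Properties props) {
            datanodeUuid = props.getProperty("datanodeUuid");
            cTime = Long.parseLong(props.getProperty("cTime"));
            updates++;
        }

        public synchronized String getDatanodeUuid() { return datanodeUuid; }
        public synchronized long getCTime() { return cTime; }
        public synchronized int getUpdates() { return updates; }
    }

    // Simulates numDirs storage directories being loaded concurrently,
    // each thread pushing its (made-up) VERSION properties into the shared object.
    public static SharedStorageInfo loadInParallel(int numDirs) throws InterruptedException {
        SharedStorageInfo info = new SharedStorageInfo();
        ExecutorService pool = Executors.newFixedThreadPool(numDirs);
        for (int i = 0; i < numDirs; i++) {
            pool.submit(() -> {
                Properties props = new Properties();
                props.setProperty("datanodeUuid", "uuid-1234"); // made-up value
                props.setProperty("cTime", "0");                // made-up value
                info.setFieldsFromProperties(props);
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return info;
    }

    public static void main(String[] args) throws InterruptedException {
        SharedStorageInfo info = loadInParallel(8);
        System.out.println(info.getDatanodeUuid() + " updates=" + info.getUpdates());
    }
}
```

Without the {{synchronized}} keyword the unguarded {{updates++}} could lose increments under contention, which is the kind of latent hazard the review comment warns about.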

> On upgrade, Datanode should process all storage/data dirs in parallel
> ---------------------------------------------------------------------
>
>                 Key: HDFS-8578
>                 URL: https://issues.apache.org/jira/browse/HDFS-8578
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Raju Bairishetti
>            Assignee: Vinayakumar B
>            Priority: Critical
>         Attachments: HDFS-8578-01.patch, HDFS-8578-02.patch, 
> HDFS-8578-03.patch, HDFS-8578-04.patch, HDFS-8578-05.patch, 
> HDFS-8578-06.patch, HDFS-8578-07.patch, HDFS-8578-08.patch, 
> HDFS-8578-09.patch, HDFS-8578-10.patch, HDFS-8578-11.patch, 
> HDFS-8578-branch-2.6.0.patch
>
>
> Right now, during upgrades the datanode processes all the storage dirs 
> sequentially. Assuming it takes ~20 mins to process a single storage dir, a 
> datanode with ~10 disks will take around 3 hours to come up.
> *BlockPoolSliceStorage.java*
> {code}
>     for (int idx = 0; idx < getNumStorageDirs(); idx++) {
>       doTransition(datanode, getStorageDir(idx), nsInfo, startOpt);
>       assert getCTime() == nsInfo.getCTime()
>           : "Data-node and name-node CTimes must be the same.";
>     }
> {code}
> It would save a lot of time during major upgrades if the datanode processed 
> all storage dirs/disks in parallel.
> Can we make the datanode process all storage dirs in parallel?
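
As a rough sketch of what the request in the issue description amounts to (this is not the attached patch), the sequential loop could be replaced by one task per storage dir submitted to a thread pool. {{ParallelStorageDirsSketch}}, {{processAll}} and the string-based {{doTransition}} placeholder are assumptions made for illustration; the real {{doTransition}} takes different arguments.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelStorageDirsSketch {

    // Hypothetical placeholder for the per-directory upgrade work.
    public static String doTransition(String storageDir) {
        return storageDir + ":done";
    }

    // Runs doTransition for every storage dir concurrently and
    // returns the results in the same order as the input list.
    public static List<String> processAll(List<String> storageDirs)
            throws InterruptedException, ExecutionException {
        ExecutorService pool =
            Executors.newFixedThreadPool(Math.max(1, storageDirs.size()));
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String dir : storageDirs) {
                Callable<String> task = () -> doTransition(dir);
                futures.add(pool.submit(task));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                // get() blocks until the task finishes and throws
                // ExecutionException if that directory's task failed.
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(processAll(Arrays.asList("/data1", "/data2", "/data3")));
    }
}
```

With one thread per disk, the wall-clock time is bounded by the slowest directory instead of the sum over all directories, assuming the disks are independent.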



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
