[ 
https://issues.apache.org/jira/browse/HDFS-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14622173#comment-14622173
 ] 

Raju Bairishetti commented on HDFS-8578:
----------------------------------------

Thanks [~vinayrpet] for the quick response and the patch.

bq. Can we shutdown the ExecutorService once the work is done?
[~vinayrpet] Could we handle this case as well, if you agree it is valid?
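
To make the shutdown suggestion concrete, here is a minimal sketch, not code from the patch. The class and method names (ParallelLoadSketch, runAll) are hypothetical, and it assumes the per-volume work can be expressed as Callable tasks and that numThreads is at least 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper for illustration only; not part of the HDFS-8578 patch.
class ParallelLoadSketch {

  // Runs every task on a fixed-size pool and always shuts the pool down,
  // even if one of the tasks fails. numThreads must be >= 1.
  static <T> List<T> runAll(List<Callable<T>> tasks, int numThreads)
      throws Exception {
    ExecutorService executor = Executors.newFixedThreadPool(numThreads);
    try {
      List<T> results = new ArrayList<>();
      // invokeAll blocks until all tasks have completed and returns the
      // futures in the same order as the submitted tasks.
      for (Future<T> future : executor.invokeAll(tasks)) {
        // get() rethrows a task failure wrapped in an ExecutionException.
        results.add(future.get());
      }
      return results;
    } finally {
      // Release the worker threads once the work is done.
      executor.shutdown();
    }
  }
}
```

Putting shutdown() in the finally block means the worker threads are released whether the upgrade work succeeds or throws.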

One more minor ask: should we validate the config value provided by the 
user, i.e. check whether it is valid?

{code}
int numParallelThreads = datanode.getConf().getInt(
    DFSConfigKeys.DFS_DATANODE_PARALLEL_VOLUME_LOAD_THREADS_NUM_KEY,
    dataDirs.size());
{code}

I feel we should not honor the config value if it is greater than the number 
of volumes on the datanode. Can we compute the number of threads as the 
minimum of the user-provided value and the total number of volumes (i.e. 
min(userProvidedValue, dataDirs.size()))? Users usually keep a single config 
value across all datanodes, but the number of disks per datanode can vary. In 
such cases we may end up spawning more threads than necessary.
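
A rough sketch of the min(userProvidedValue, dataDirs.size()) idea together with the validation asked about above. The names (VolumeLoadThreadsSketch, resolveThreadCount) are hypothetical, and two behaviors are my own assumptions rather than anything in the patch: a non-positive config value falls back to one thread per volume (the same default the snippet above passes to getInt), and the result is never less than 1.

```java
// Hypothetical helper for illustration only; not part of the HDFS-8578 patch.
class VolumeLoadThreadsSketch {

  // configuredThreads: the value read from the config
  // numVolumes: the number of data dirs on this datanode (dataDirs.size())
  static int resolveThreadCount(int configuredThreads, int numVolumes) {
    // Assumption: treat zero or negative values as invalid and fall back to
    // the default of one thread per volume.
    int requested = configuredThreads > 0 ? configuredThreads : numVolumes;
    // Never use more threads than volumes, and never fewer than one.
    return Math.max(1, Math.min(requested, numVolumes));
  }
}
```

With this, a cluster-wide setting of 16 yields 4 threads on a 4-disk datanode and 12 threads on a 12-disk datanode.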

> On upgrade, Datanode should process all storage/data dirs in parallel
> ---------------------------------------------------------------------
>
>                 Key: HDFS-8578
>                 URL: https://issues.apache.org/jira/browse/HDFS-8578
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Raju Bairishetti
>            Priority: Critical
>         Attachments: HDFS-8578-01.patch, HDFS-8578-02.patch, 
> HDFS-8578-03.patch, HDFS-8578-branch-2.6.0.patch
>
>
> Right now, during upgrades the datanode processes all the storage dirs 
> sequentially. Assuming it takes ~20 mins to process a single storage dir, a 
> datanode with ~10 disks will take around 3 hours to come up.
> *BlockPoolSliceStorage.java*
> {code}
>    for (int idx = 0; idx < getNumStorageDirs(); idx++) {
>      doTransition(datanode, getStorageDir(idx), nsInfo, startOpt);
>      assert getCTime() == nsInfo.getCTime()
>          : "Data-node and name-node CTimes must be the same.";
>    }
> {code}
> It would save a lot of time during major upgrades if the datanode processed 
> all storage dirs/disks in parallel.
> Can we make the datanode process all storage dirs in parallel?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
