[
https://issues.apache.org/jira/browse/HDFS-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15041792#comment-15041792
]
Joep Rottinghuis commented on HDFS-8578:
----------------------------------------
If I read the patch correctly, the parallelism will be the total number of
storage directories:
{code}
int numParallelThreads = dataDirs.size();
{code}
With 12 disks and 3 namespaces, that would mean 36 parallel threads, right?
Perhaps it would be better to make this configurable. It could default to 0,
meaning as parallel as possible, or take any explicit value (up to the number
of storage directories; a newFixedThreadPool with more threads than tasks
would simply not run more in parallel anyway). That way cluster admins can
choose either to dial this all the way up, hammer the disks and page cache,
and get it over with as soon as possible, or to tune it down a bit in case
they choose to keep the NodeManager (NM) up and executing tasks in the meantime.
I can imagine that 12 parallel threads (1 per disk in the above example) might
turn out to be a reasonable compromise for some use cases.
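To make the suggestion concrete, here is a minimal sketch of how such a setting
could be resolved into a pool size. It is not taken from the patch: the class
name UpgradeParallelism and the methods resolveThreadCount and runAll are
hypothetical, and treating negative values the same as 0 is an assumption on
my part.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper for illustration; not part of the HDFS-8578 patch. */
class UpgradeParallelism {

  /**
   * Maps the configured value to a thread count.
   * 0 means "as parallel as possible" (one thread per storage directory);
   * negative values are treated the same way (an assumption of this sketch).
   * Explicit values are capped at the number of storage directories.
   */
  static int resolveThreadCount(int configured, int numStorageDirs) {
    if (numStorageDirs <= 0) {
      throw new IllegalArgumentException("numStorageDirs must be positive");
    }
    if (configured <= 0) {
      return numStorageDirs;
    }
    return Math.min(configured, numStorageDirs);
  }

  /** Runs one task per storage directory on a fixed-size pool and waits for all of them. */
  static <T> List<T> runAll(List<Callable<T>> tasks, int configured) {
    List<T> results = new ArrayList<>();
    if (tasks.isEmpty()) {
      return results;
    }
    ExecutorService pool =
        Executors.newFixedThreadPool(resolveThreadCount(configured, tasks.size()));
    try {
      // invokeAll blocks until every task has completed or failed.
      for (Future<T> f : pool.invokeAll(tasks)) {
        results.add(f.get());
      }
      return results;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("Interrupted while processing storage dirs", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("A storage dir task failed", e.getCause());
    } finally {
      pool.shutdown();
    }
  }
}
```

With 36 storage directories, a configured value of 0 would give 36 threads and
a configured value of 12 would give 12, which matches the one-thread-per-disk
compromise mentioned above.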
> On upgrade, Datanode should process all storage/data dirs in parallel
> ---------------------------------------------------------------------
>
> Key: HDFS-8578
> URL: https://issues.apache.org/jira/browse/HDFS-8578
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Reporter: Raju Bairishetti
> Assignee: Vinayakumar B
> Priority: Critical
> Attachments: HDFS-8578-01.patch, HDFS-8578-02.patch,
> HDFS-8578-03.patch, HDFS-8578-04.patch, HDFS-8578-05.patch,
> HDFS-8578-06.patch, HDFS-8578-07.patch, HDFS-8578-08.patch,
> HDFS-8578-09.patch, HDFS-8578-10.patch, HDFS-8578-11.patch,
> HDFS-8578-branch-2.6.0.patch
>
>
> Right now, during upgrades, the datanode processes all the storage dirs
> sequentially. Assuming it takes ~20 mins to process a single storage dir, a
> datanode which has ~10 disks will take around 3 hours to come up.
> *BlockPoolSliceStorage.java*
> {code}
> for (int idx = 0; idx < getNumStorageDirs(); idx++) {
>   doTransition(datanode, getStorageDir(idx), nsInfo, startOpt);
>   assert getCTime() == nsInfo.getCTime()
>       : "Data-node and name-node CTimes must be the same.";
> }
> {code}
> It would save a lot of time during major upgrades if the datanode processed
> all storage dirs/disks in parallel.
> Can we make the datanode process all storage dirs in parallel?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)