[ https://issues.apache.org/jira/browse/HDFS-8578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15045900#comment-15045900 ]
Chris Trezzo commented on HDFS-8578:
------------------------------------
One possible solution to limit memory growth is to rewrite
{{DataStorage#linkBlocks}} and {{DataStorage#linkBlocksHelper}} to use a
producer/consumer model with a bounded queue. For example, you could use a
[LinkedBlockingQueue|https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/LinkedBlockingQueue.html]
with a fixed capacity. The logic in the {{DataStorage#linkBlocksHelper}} method
would be the producer: it adds LinkArgs objects to the queue and blocks
whenever the queue is full. The tasks in the linkWorkers ExecutorService would
do what they do now, except that they would pull LinkArgs objects out of the
queue.
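As a rough illustration of the bounded-queue idea (a sketch, not the actual DataStorage code): the {{LinkArgs}} class below is a simplified stand-in, and {{linkAll}}, the poison-pill shutdown, the file names, and the capacity and worker counts are all assumptions made for the example. It assumes Java 8+ for the lambda syntax.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class BoundedLinkQueueSketch {
  // Hypothetical stand-in for DataStorage's LinkArgs: one hard link to create.
  static final class LinkArgs {
    final String src;
    final String dst;
    LinkArgs(String src, String dst) { this.src = src; this.dst = dst; }
  }

  // Sentinel ("poison pill") that tells a worker there is no more work.
  private static final LinkArgs POISON = new LinkArgs("", "");

  // Produces numBlocks LinkArgs through a queue of fixed capacity and
  // returns how many the workers consumed.
  static int linkAll(int numBlocks, int queueCapacity, int numWorkers) {
    final BlockingQueue<LinkArgs> queue = new LinkedBlockingQueue<>(queueCapacity);
    final AtomicInteger linked = new AtomicInteger();
    ExecutorService linkWorkers = Executors.newFixedThreadPool(numWorkers);

    // Consumers: each worker pulls LinkArgs until it sees the sentinel.
    for (int i = 0; i < numWorkers; i++) {
      linkWorkers.submit(() -> {
        try {
          while (true) {
            LinkArgs args = queue.take();
            if (args == POISON) {
              return;
            }
            // Real code would create the hard link args.src -> args.dst here.
            linked.incrementAndGet();
          }
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
      });
    }

    try {
      // Producer: put() blocks while the queue is full, so at most
      // queueCapacity LinkArgs objects are pending at any time.
      for (int i = 0; i < numBlocks; i++) {
        queue.put(new LinkArgs("previous/blk_" + i, "current/blk_" + i));
      }
      // One sentinel per worker so that every worker terminates.
      for (int i = 0; i < numWorkers; i++) {
        queue.put(POISON);
      }
      linkWorkers.shutdown();
      if (!linkWorkers.awaitTermination(1, TimeUnit.MINUTES)) {
        throw new IllegalStateException("link workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      linkWorkers.shutdownNow();
      throw new IllegalStateException("interrupted while linking", e);
    }
    return linked.get();
  }

  public static void main(String[] args) {
    System.out.println("linked " + linkAll(1000, 16, 4) + " blocks");
  }
}
```

With this shape, memory use is bounded by the queue capacity rather than by the total number of blocks. A real implementation would also need to handle worker failures, since a producer blocked in {{put()}} would wait forever if every consumer had died.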
[~vinayrpet] Do you want to take a crack at this? If you don't have time in the
next day or so, let me know and I will take a look.
> On upgrade, Datanode should process all storage/data dirs in parallel
> ---------------------------------------------------------------------
>
> Key: HDFS-8578
> URL: https://issues.apache.org/jira/browse/HDFS-8578
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Reporter: Raju Bairishetti
> Assignee: Vinayakumar B
> Priority: Critical
> Attachments: HDFS-8578-01.patch, HDFS-8578-02.patch,
> HDFS-8578-03.patch, HDFS-8578-04.patch, HDFS-8578-05.patch,
> HDFS-8578-06.patch, HDFS-8578-07.patch, HDFS-8578-08.patch,
> HDFS-8578-09.patch, HDFS-8578-10.patch, HDFS-8578-11.patch,
> HDFS-8578-12.patch, HDFS-8578-branch-2.6.0.patch
>
>
> Right now, during upgrades the datanode processes all storage dirs
> sequentially. Assuming it takes ~20 mins to process a single storage dir, a
> datanode with ~10 disks will take around 3 hours to come up.
> *BlockPoolSliceStorage.java*
> {code}
> for (int idx = 0; idx < getNumStorageDirs(); idx++) {
>   doTransition(datanode, getStorageDir(idx), nsInfo, startOpt);
>   assert getCTime() == nsInfo.getCTime()
>       : "Data-node and name-node CTimes must be the same.";
> }
> {code}
> It would save a lot of time during major upgrades if the datanode processed
> all storage dirs/disks in parallel.
> Can we make the datanode process all storage dirs in parallel?