[ 
https://issues.apache.org/jira/browse/HDFS-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13977632#comment-13977632
 ] 

Ming Ma commented on HDFS-5757:
-------------------------------

It seems the property dfs.namenode.decommission.nodes.per.interval already 
supports this: you can configure how many DNs DecommissionManager checks each 
time it takes the writeLock. Perhaps we can do something similar for 
DatanodeManager's refreshNodes.

DatanodeManager's refreshNodes takes the writeLock when it kicks off the 
decommission process. We could modify refreshNodes to kick off only a few 
nodes each time the writeLock is acquired, and to return the RPC request 
quickly without waiting for DatanodeManager to finish the whole process.
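As an illustration of that idea, here is a minimal sketch (the names AsyncRefresh, worker, and refreshNodes's return type are invented for illustration; this is not the actual NN code): the RPC handler only enqueues the per-node work on a background thread and returns immediately.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AsyncRefresh {
    // Single background thread that performs the actual decommission kickoff.
    static final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Returns immediately; the caller's RPC completes without waiting for
    // the per-node work to finish.
    static Future<Integer> refreshNodes(List<String> newlyDecommissioned) {
        return worker.submit(() -> {
            int started = 0;
            for (String dn : newlyDecommissioned) {
                // Here the real code would take the writeLock, kick off
                // decommission for dn, and release the lock, one node
                // (or a small batch) at a time.
                started++;
            }
            return started;
        });
    }
}
```

The Future is only there so a caller (or a test) can observe completion; the RPC itself would not block on it.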

There are other, less important scenarios: a. if a machine has lots of blocks, 
this could still hold the NN's writeLock for some time; b. even when nothing 
is being decommissioned, DecommissionManager still takes the writeLock and 
walks through all DNs.
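A minimal sketch of the batching idea behind dfs.namenode.decommission.nodes.per.interval (DecommissionBatcher, isReplicationComplete, and the constant name are invented for illustration, not the real NN classes): check at most a fixed number of decommissioning nodes per writeLock acquisition, remembering where the scan stopped so the lock is released between batches.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class DecommissionBatcher {
    // Analogous to dfs.namenode.decommission.nodes.per.interval.
    static final int NODES_PER_INTERVAL = 5;

    final ReentrantReadWriteLock namesystemLock = new ReentrantReadWriteLock();
    final List<String> decommissioning = new ArrayList<>();
    int cursor = 0; // where the previous scan stopped

    // Checks at most NODES_PER_INTERVAL nodes, then releases the writeLock
    // so other NN operations can proceed before the next batch.
    int checkDecommissionBatch() {
        namesystemLock.writeLock().lock();
        try {
            int checked = 0;
            while (checked < NODES_PER_INTERVAL && !decommissioning.isEmpty()) {
                if (cursor >= decommissioning.size()) cursor = 0;
                String dn = decommissioning.get(cursor);
                if (isReplicationComplete(dn)) {
                    decommissioning.remove(cursor); // decommission finished
                } else {
                    cursor++; // revisit on a later pass
                }
                checked++;
            }
            return checked;
        } finally {
            namesystemLock.writeLock().unlock();
        }
    }

    // Placeholder for the real blockManager.isReplicationInProgress check.
    boolean isReplicationComplete(String dn) {
        return true;
    }
}
```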

> Decommissioning lots of nodes at the same time could slow down NN
> -----------------------------------------------------------------
>
>                 Key: HDFS-5757
>                 URL: https://issues.apache.org/jira/browse/HDFS-5757
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>            Reporter: Ming Ma
>
> Sometimes we need to decommission a whole rack of nodes at the same time. 
> While the decommission is in progress, the NN is slow.
> The reason is that when DecommissionManager checks the decommission status, 
> it acquires the namesystem's writer lock and iterates through all DNs; for 
> each DN in decommissioning state, it checks whether replication is done for 
> all the blocks on the machine via blockManager.isReplicationInProgress; for 
> a large cluster, the number of blocks on the machine could be big.
> The fix could be to have DecommissionManager check only a few 
> decommission-in-progress nodes each time it acquires the namesystem's 
> writer lock.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
