[ https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17550940#comment-17550940 ]
Hiroyuki Adachi commented on HDFS-16613:
----------------------------------------

[~caozhiqiang], thank you for your explanation. It looks good.

Now I understand that blocksToProcess controls how much replication work is scheduled, so if it is less than dfs.namenode.replication.max-streams-hard-limit, all blocks are replicated from the decommissioning node rather than reconstructed.

Could you please tell me the values of dfs.namenode.replication.max-streams-hard-limit and dfs.namenode.replication.work.multiplier.per.iteration?

{code:java}
// BlockManager#computeDatanodeWork
final int blocksToProcess = numlive * this.blocksReplWorkMultiplier;
final int nodesToProcess = (int) Math.ceil(numlive * this.blocksInvalidateWorkPct);

int workFound = this.computeBlockReconstructionWork(blocksToProcess);
{code}

> EC: Improve performance of decommissioning dn with many ec blocks
> -----------------------------------------------------------------
>
>                 Key: HDFS-16613
>                 URL: https://issues.apache.org/jira/browse/HDFS-16613
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: ec, erasure-coding, namenode
>    Affects Versions: 3.4.0
>            Reporter: caozhiqiang
>            Assignee: caozhiqiang
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2022-06-07-11-46-42-389.png,
> image-2022-06-07-17-42-16-075.png, image-2022-06-07-17-45-45-316.png,
> image-2022-06-07-17-51-04-876.png, image-2022-06-07-17-55-40-203.png
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In an HDFS cluster with many EC blocks, decommissioning a dn is very slow.
> The reason is that, unlike replicated blocks, which can be copied from any dn
> that holds a replica of the block, EC blocks have to be replicated from the
> decommissioning dn.
> The configurations dfs.namenode.replication.max-streams and
> dfs.namenode.replication.max-streams-hard-limit limit the replication
> speed, but increasing them puts the whole cluster's network at risk.
> So a new configuration should be added that limits only the
> decommissioning dn, separately from the cluster-wide max-streams limit.
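
The arithmetic in the BlockManager#computeDatanodeWork excerpt quoted in the comment can be tried out in isolation. The following is only a minimal sketch: the class name DatanodeWorkSketch, its method names, and the sample values in main are hypothetical and not part of Hadoop, and the belowHardLimit check merely encodes the understanding stated in the comment (blocksToProcess compared against dfs.namenode.replication.max-streams-hard-limit), not verified NameNode behavior.

```java
public class DatanodeWorkSketch {

    // Mirrors: numlive * this.blocksReplWorkMultiplier
    // (the multiplier corresponds to
    // dfs.namenode.replication.work.multiplier.per.iteration).
    static int blocksToProcess(int numLive, int replWorkMultiplier) {
        return numLive * replWorkMultiplier;
    }

    // Mirrors: (int) Math.ceil(numlive * this.blocksInvalidateWorkPct)
    static int nodesToProcess(int numLive, float invalidateWorkPct) {
        return (int) Math.ceil(numLive * invalidateWorkPct);
    }

    // Hypothetical helper: the condition described in the comment, i.e.
    // the per-iteration work budget is smaller than the hard limit
    // (dfs.namenode.replication.max-streams-hard-limit).
    static boolean belowHardLimit(int blocksToProcess, int maxStreamsHardLimit) {
        return blocksToProcess < maxStreamsHardLimit;
    }

    public static void main(String[] args) {
        // Illustrative values only; substitute the cluster's real settings.
        int multiplier = 2;
        int hardLimit = 4;
        for (int numLive : new int[] {1, 10, 100}) {
            int budget = blocksToProcess(numLive, multiplier);
            System.out.println("numLive=" + numLive
                + " blocksToProcess=" + budget
                + " belowHardLimit=" + belowHardLimit(budget, hardLimit));
        }
    }
}
```

Running main with the actual values of the two settings asked about in the comment shows, for a given number of live datanodes, whether the per-iteration budget falls below the hard limit.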