[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17550907#comment-17550907
 ] 

caozhiqiang commented on HDFS-16613:
------------------------------------

[~hadachi], thank you for your review.

First, my Hadoop branch already includes HDFS-14768. In my test, even when the 
decommissioning node is made busy, EC blocks are not reconstructed. No EC task 
is sent to a datanode; the blocks are only held in 
BlockManager::pendingReconstruction. After the timeout, these blocks are put 
back into BlockManager::neededReconstruction and rescheduled the next time. So 
all blocks on the decommissioning node are copied by replication, not by 
reconstruction. By the way, I decommission only one dn at a time.
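
To make the cycle described above concrete, here is a minimal toy model of it. It is only a sketch: the class and method names are hypothetical and are not the HDFS BlockManager code, and the timeout value is whatever the caller passes in. It shows blocks moving from a "needed" queue into a "pending" map when scheduled, and moving back once the timeout has passed without completion.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Toy model only: hypothetical names, not the actual HDFS BlockManager code.
class ReconstructionQueueSketch {
    // Stands in for neededReconstruction: blocks waiting to be scheduled.
    private final Queue<String> needed = new ArrayDeque<>();
    // Stands in for pendingReconstruction: block -> time (ms) it was scheduled.
    private final Map<String, Long> pending = new LinkedHashMap<>();
    private final long timeoutMs;

    ReconstructionQueueSketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    void addNeeded(String block) {
        needed.add(block);
    }

    // Moves up to `limit` blocks from needed to pending and returns them.
    // Whether a task actually reaches a datanode is outside this model.
    List<String> schedule(int limit, long nowMs) {
        List<String> scheduled = new ArrayList<>();
        while (scheduled.size() < limit && !needed.isEmpty()) {
            String block = needed.poll();
            pending.put(block, nowMs);
            scheduled.add(block);
        }
        return scheduled;
    }

    // Puts every pending block at least as old as the timeout back into needed.
    int requeueTimedOut(long nowMs) {
        int requeued = 0;
        Iterator<Map.Entry<String, Long>> it = pending.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMs - entry.getValue() >= timeoutMs) {
                String block = entry.getKey(); // read the key before removing
                it.remove();
                needed.add(block);
                requeued++;
            }
        }
        return requeued;
    }

    int neededSize() { return needed.size(); }

    int pendingSize() { return pending.size(); }
}
```

In this model, a block that is scheduled but never completed simply sits in the pending map until requeueTimedOut is called late enough, which matches the delay I see before the blocks are rescheduled.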

Second, there are 12 datanodes in my cluster, and each dn has 12 disks. There 
are 27217 EC block groups in my cluster and about 20000 blocks on one datanode. 
Apart from the decommissioning node, the load on the other nodes is very low, 
including load average, CPU iowait, and network.

!image-2022-06-07-17-55-40-203.png|width=772,height=192!

!image-2022-06-07-17-45-45-316.png|width=772,height=198!

!image-2022-06-07-17-51-04-876.png|width=769,height=256!

> EC: Improve performance of decommissioning dn with many ec blocks
> -----------------------------------------------------------------
>
>                 Key: HDFS-16613
>                 URL: https://issues.apache.org/jira/browse/HDFS-16613
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: ec, erasure-coding, namenode
>    Affects Versions: 3.4.0
>            Reporter: caozhiqiang
>            Assignee: caozhiqiang
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: image-2022-06-07-11-46-42-389.png, 
> image-2022-06-07-17-42-16-075.png, image-2022-06-07-17-45-45-316.png, 
> image-2022-06-07-17-51-04-876.png, image-2022-06-07-17-55-40-203.png
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> In an HDFS cluster with a lot of EC blocks, decommissioning a dn is very slow. 
> The reason is that, unlike replicated blocks, which can be copied from any dn 
> that holds a replica of the same block, EC blocks have to be replicated from 
> the decommissioning dn itself.
> The configurations dfs.namenode.replication.max-streams and 
> dfs.namenode.replication.max-streams-hard-limit limit the replication 
> speed, but increasing them puts the whole cluster's network at risk. So a new 
> configuration should be added to limit the decommissioning dn separately from 
> the cluster-wide max-streams limit.
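
The proposal in the description can be sketched roughly as follows. This is an illustration only, not the actual patch: the class, the method names, and the decommissionMaxStreams parameter are hypothetical, and the numbers in the usage note are made up. The idea is that the number of new transfers handed to a node is its stream limit minus the transfers already in progress, and that a decommissioning node gets its own limit instead of the cluster-wide one.

```java
// Illustration only: hypothetical names, not the HDFS implementation.
class StreamLimitSketch {
    // Picks the stream limit that applies to a node.
    // `decommissionMaxStreams` is a hypothetical separate setting.
    static int maxStreamsFor(boolean decommissioning,
                             int clusterMaxStreams,
                             int decommissionMaxStreams) {
        return decommissioning ? decommissionMaxStreams : clusterMaxStreams;
    }

    // Number of new transfers that may be scheduled on the node right now.
    static int transfersToSchedule(boolean decommissioning,
                                   int clusterMaxStreams,
                                   int decommissionMaxStreams,
                                   int transfersInProgress) {
        int limit = maxStreamsFor(decommissioning, clusterMaxStreams,
                decommissionMaxStreams);
        // Never return a negative count when the node is already over its limit.
        return Math.max(0, limit - transfersInProgress);
    }
}
```

For example, with a cluster-wide limit of 2, a hypothetical decommissioning limit of 8, and 1 transfer in progress, a normal node would be given 1 more transfer while the decommissioning node would be given 7. The other nodes keep the low cluster-wide limit, so the extra traffic is confined to the node being drained.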



--
This message was sent by Atlassian Jira
(v8.20.7#820007)

