[
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17545904#comment-17545904
]
caozhiqiang edited comment on HDFS-16613 at 6/3/22 3:30 PM:
------------------------------------------------------------
[~tasanuma] [~hadachi], besides adding a new configuration to limit
decommissioning dns separately, we could also use
dfs.namenode.replication.max-streams-hard-limit to achieve the same
purpose. We only need to modify DatanodeManager::handleHeartbeat() so that
it uses dfs.namenode.replication.max-streams-hard-limit to compute
numReplicationTasks for a decommissioning dn. I created a new PR
[4398|https://github.com/apache/hadoop/pull/4398]; please help review it
when you are free.
{code:java}
int maxTransfers;
if (nodeinfo.isDecommissionInProgress()) {
  // A decommissioning dn is the only possible source for its EC internal
  // blocks, so let it schedule transfers up to the hard limit.
  maxTransfers = blockManager.getReplicationStreamsHardLimit()
      - xmitsInProgress;
} else {
  // All other dns keep the cluster-wide soft limit.
  maxTransfers = blockManager.getMaxReplicationStreams()
      - xmitsInProgress;
}
{code}
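For context, both limits referenced above are existing keys, so this
approach does not require a new configuration. A minimal, standalone sketch
of reading them (assuming the standard DFSConfigKeys constants and defaults;
the class name is just for illustration, and this is not part of the patch):
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class ReplicationLimitsSketch {
  public static void main(String[] args) {
    Configuration conf = new HdfsConfiguration();
    // Soft, cluster-wide limit: "dfs.namenode.replication.max-streams"
    // (default 2).
    int maxStreams = conf.getInt(
        DFSConfigKeys.DFS_NAMENODE_REPLICATION_MAX_STREAMS_KEY,
        DFSConfigKeys.DFS_NAMENODE_REPLICATION_MAX_STREAMS_DEFAULT);
    // Hard limit used above for decommissioning dns:
    // "dfs.namenode.replication.max-streams-hard-limit" (default 4).
    int hardLimit = conf.getInt(
        DFSConfigKeys.DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_KEY,
        DFSConfigKeys.DFS_NAMENODE_REPLICATION_STREAMS_HARD_LIMIT_DEFAULT);
    System.out.println("max-streams=" + maxStreams
        + ", max-streams-hard-limit=" + hardLimit);
  }
}
{code}
Since only dns in the DECOMMISSION_IN_PROGRESS state get the hard limit,
the cluster-wide soft limit, and therefore the network risk for normal
replication, stays unchanged.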
> EC: Improve performance of decommissioning dn with many ec blocks
> -----------------------------------------------------------------
>
> Key: HDFS-16613
> URL: https://issues.apache.org/jira/browse/HDFS-16613
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: ec, erasure-coding, namenode
> Affects Versions: 3.4.0
> Reporter: caozhiqiang
> Assignee: caozhiqiang
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> In an HDFS cluster with a lot of EC blocks, decommissioning a dn is very
> slow. The reason is that, unlike a replicated block, which can be copied
> from any dn holding a replica, an EC block has to be replicated from the
> decommissioning dn itself. (For example, with RS-6-3 each internal block
> has only one copy, so the copy on the decommissioning dn is the only
> possible source, while a 3x-replicated block still has two other sources.)
> The configurations dfs.namenode.replication.max-streams and
> dfs.namenode.replication.max-streams-hard-limit limit the replication
> speed, but increasing them puts the whole cluster's network at risk. So a
> new configuration should be added to limit the decommissioning dn alone,
> distinguished from the cluster-wide max-streams limit.