[jira] [Commented] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-11-17 Thread caozhiqiang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635400#comment-17635400
 ] 

caozhiqiang commented on HDFS-16613:


[~tasanuma], OK, I have created an issue, 
[HDFS-16846|https://issues.apache.org/jira/browse/HDFS-16846]. Please help 
review it.

> EC: Improve performance of decommissioning dn with many ec blocks
> -
>
> Key: HDFS-16613
> URL: https://issues.apache.org/jira/browse/HDFS-16613
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: ec, erasure-coding, namenode
>Affects Versions: 3.4.0
>Reporter: caozhiqiang
>Assignee: caozhiqiang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 3.4.0
>
> Attachments: image-2022-06-07-11-46-42-389.png, 
> image-2022-06-07-17-42-16-075.png, image-2022-06-07-17-45-45-316.png, 
> image-2022-06-07-17-51-04-876.png, image-2022-06-07-17-55-40-203.png, 
> image-2022-06-08-11-38-29-664.png, image-2022-06-08-11-41-11-127.png
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> In an HDFS cluster with a lot of EC blocks, decommissioning a DataNode is very slow. 
> The reason is that, unlike replicated blocks, which can be copied from any DataNode 
> that holds a replica of the same block, EC blocks have to be replicated from the 
> decommissioning DataNode itself.
> The configurations dfs.namenode.replication.max-streams and 
> dfs.namenode.replication.max-streams-hard-limit limit the replication 
> speed, but increasing them puts the whole cluster's network at risk. 
> So a new configuration should be added to limit the decommissioning DataNode 
> separately from the cluster-wide max-streams limit.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-11-16 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635068#comment-17635068
 ] 

Takanobu Asanuma commented on HDFS-16613:
-

Thanks for your reply and the suggestion, [~caozhiqiang].

Given that the speed is not much improved in the case of replication, I would 
prefer that this setting only affect EC blocks. Could you please create another 
issue to address it?




[jira] [Commented] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-11-16 Thread caozhiqiang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17634860#comment-17634860
 ] 

caozhiqiang commented on HDFS-16613:


[~tasanuma], yes, with this change the 
dfs.namenode.replication.max-streams-hard-limit configuration only affects 
decommissioning DataNodes, but it does not distinguish between replicated blocks 
and EC blocks. If you think this configuration should only affect EC blocks, 
we can change the code in DatanodeManager as below:

 
{code:java}
// Replicated blocks always use the regular max-streams limit.
int maxReplicaTransfers = blockManager.getMaxReplicationStreams()
    - xmitsInProgress;
// EC blocks use the hard limit only while the node is decommissioning.
int maxEcTransfers;
if (nodeinfo.isDecommissionInProgress()) {
  maxEcTransfers = blockManager.getReplicationStreamsHardLimit()
      - xmitsInProgress;
} else {
  maxEcTransfers = blockManager.getMaxReplicationStreams()
      - xmitsInProgress;
}
int numReplicationTasks = (int) Math.ceil(
    (double) (totalReplicateBlocks * maxReplicaTransfers) / totalBlocks);
int numECTasks = (int) Math.ceil(
    (double) (totalECBlocks * maxEcTransfers) / totalBlocks);
{code}
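To see how the proposed split behaves, the fragment above can be lifted into a standalone sketch. The class name, the method signature, and the sample numbers (2 streams, a hard limit of 512, 100 replicated and 300 EC blocks queued) are assumptions for illustration and are not taken from DatanodeManager. The clamp to zero and the definition of totalBlocks as the sum of the two queues are also assumptions of this sketch.

```java
// Standalone sketch of the proposed split; not the actual DatanodeManager code.
class TransferSplitSketch {

    /**
     * Returns {numReplicationTasks, numECTasks} for one heartbeat.
     * All parameters are plain ints standing in for the BlockManager getters.
     */
    static int[] split(boolean decommissionInProgress, int maxStreams, int hardLimit,
                       int xmitsInProgress, int totalReplicateBlocks, int totalECBlocks) {
        // Assumption: totalBlocks is the sum of both queues.
        int totalBlocks = totalReplicateBlocks + totalECBlocks;
        if (totalBlocks == 0) {
            return new int[] {0, 0}; // nothing queued for this node
        }
        // Replicated blocks always use the regular limit (clamp added for this sketch).
        int maxReplicaTransfers = Math.max(0, maxStreams - xmitsInProgress);
        // EC blocks get the hard limit only while the node is decommissioning.
        int ecLimit = decommissionInProgress ? hardLimit : maxStreams;
        int maxEcTransfers = Math.max(0, ecLimit - xmitsInProgress);

        int numReplicationTasks = (int) Math.ceil(
            (double) (totalReplicateBlocks * maxReplicaTransfers) / totalBlocks);
        int numECTasks = (int) Math.ceil(
            (double) (totalECBlocks * maxEcTransfers) / totalBlocks);
        return new int[] {numReplicationTasks, numECTasks};
    }

    public static void main(String[] args) {
        int[] decom = split(true, 2, 512, 0, 100, 300);
        int[] normal = split(false, 2, 512, 0, 100, 300);
        System.out.println("decommissioning: replication=" + decom[0] + " ec=" + decom[1]);
        System.out.println("in service:      replication=" + normal[0] + " ec=" + normal[1]);
    }
}
```

With these sample inputs the decommissioning node is given 1 replication task and 384 EC tasks, while an in-service node is given 1 and 2, so only the EC share grows with the hard limit.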
 

 




[jira] [Commented] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-11-16 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17634749#comment-17634749
 ] 

Takanobu Asanuma commented on HDFS-16613:
-

We found that this change and increasing 
`dfs.namenode.replication.max-streams-hard-limit` have a side effect. Since it 
affects not only ec files but also replication files, even if DataNodes have 
only replication files, they will always generate high network traffic during 
decommissioning even though the decommissioning speed is not much improved. 
(Although decommissioning DataNodes with many replication files is already fast 
enough.)




[jira] [Commented] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-06-10 Thread caozhiqiang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553020#comment-17553020
 ] 

caozhiqiang commented on HDFS-16613:


[~hadachi], thank you. Could you help review this PR, [GitHub Pull Request 
#4398|https://github.com/apache/hadoop/pull/4398], if this approach works?




[jira] [Commented] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-06-10 Thread Hiroyuki Adachi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17552722#comment-17552722
 ] 

Hiroyuki Adachi commented on HDFS-16613:


[~caozhiqiang] , thank you for explaining in detail.

I think the data process you described is correct and your approach for 
improving performance is right. My concern was the reconstruction load on a 
large cluster where blocksToProcess is much larger than maxTransfers. But I 
found that I had misunderstood that the blocks held by the busy node would be 
reconstructed rather than replicated. So I think there is no problem using 
dfs.namenode.replication.max-streams-hard-limit for this purpose.




[jira] [Commented] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-06-07 Thread caozhiqiang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17551381#comment-17551381
 ] 

caozhiqiang commented on HDFS-16613:


[~hadachi] , in my cluster, 
dfs.namenode.replication.max-streams-hard-limit=512, 
dfs.namenode.replication.work.multiplier.per.iteration=20.

The data process is as follows:
 # Choose the blocks to be reconstructed from neededReconstruction. This 
step uses dfs.namenode.replication.work.multiplier.per.iteration to limit the 
number processed.
 # *Choose the source datanode. This step uses 
dfs.namenode.replication.max-streams-hard-limit to limit the number processed.*
 # Choose the target datanode.
 # Add the task to the datanode.
 # The blocks to be replicated are put into pendingReconstruction. If blocks in 
pendingReconstruction time out, they are put back into neededReconstruction 
and processed again. *This step uses 
dfs.namenode.reconstruction.pending.timeout-sec to limit the time interval.*
 # *Send the command to the dn in the heartbeat response. Originally, 
dfs.namenode.decommission.max-streams was used to limit the number of tasks.*

First, step 1 is not a performance bottleneck.

The performance bottlenecks are in steps 2, 5 and 6. So we should increase the 
value of dfs.namenode.replication.max-streams-hard-limit and decrease the value 
of dfs.namenode.reconstruction.pending.timeout-sec. For step 6, we 
should use dfs.namenode.replication.max-streams-hard-limit to limit the number 
of tasks.

 
{code:java}
// DatanodeManager::handleHeartbeat
      if (nodeinfo.isDecommissionInProgress()) {
        maxTransfers = blockManager.getReplicationStreamsHardLimit()
            - xmitsInProgress;
      } else {
        maxTransfers = blockManager.getMaxReplicationStreams()
            - xmitsInProgress;
      } {code}
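The branch above can be exercised in isolation. This is a minimal sketch, assuming plain int parameters in place of the BlockManager getters; the class and method names are hypothetical, the value 2 for dfs.namenode.replication.max-streams is an assumed setting, and the clamp to zero is added here and is not part of the snippet above.

```java
// Standalone sketch of the heartbeat transfer budget; not the actual Hadoop code.
class HeartbeatTransferBudget {

    static int maxTransfers(boolean decommissionInProgress, int maxReplicationStreams,
                            int replicationStreamsHardLimit, int xmitsInProgress) {
        int limit = decommissionInProgress
            ? replicationStreamsHardLimit   // dfs.namenode.replication.max-streams-hard-limit
            : maxReplicationStreams;        // dfs.namenode.replication.max-streams
        // Clamp added for this sketch so a saturated node reports a zero budget.
        return Math.max(0, limit - xmitsInProgress);
    }

    public static void main(String[] args) {
        int maxStreams = 2;   // assumed value
        int hardLimit = 512;  // value reported for this cluster
        int xmits = 1;        // assumed number of transfers already in progress
        System.out.println("decommissioning: " + maxTransfers(true, maxStreams, hardLimit, xmits));
        System.out.println("in service:      " + maxTransfers(false, maxStreams, hardLimit, xmits));
    }
}
```

Under these assumptions a decommissioning node has a budget of 511 transfers per heartbeat, while an in-service node has a budget of 1.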
The graphs below show the under-replicated blocks and pending replicated blocks 
metrics, which reveal the performance bottleneck: a lot of blocks 
timed out in pendingReconstruction and were put back into neededReconstruction 
repeatedly. The first graph is before optimization and the second is after 
optimization.

Please help check this process, thank you.

 

!image-2022-06-08-11-41-11-127.png|width=932,height=190!

!image-2022-06-08-11-38-29-664.png|width=931,height=175!




[jira] [Commented] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-06-07 Thread Hiroyuki Adachi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17550940#comment-17550940
 ] 

Hiroyuki Adachi commented on HDFS-16613:


[~caozhiqiang] , thank you for your explanation. It looks good.

Now I understand that blocksToProcess controls the number of replication work 
items, so if it is less than dfs.namenode.replication.max-streams-hard-limit, 
all blocks on the decommissioning node use replication rather than reconstruction.

Could you please tell me the values of 
dfs.namenode.replication.max-streams-hard-limit and 
dfs.namenode.replication.work.multiplier.per.iteration?

 
{code:java}
// BlockManager#computeDatanodeWork

final int blocksToProcess = numlive
    * this.blocksReplWorkMultiplier;
final int nodesToProcess = (int) Math.ceil(numlive
    * this.blocksInvalidateWorkPct);

int workFound = this.computeBlockReconstructionWork(blocksToProcess);
{code}
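As a rough worked example of this arithmetic, the sketch below plugs in figures reported elsewhere in this thread: 12 datanodes, a work multiplier of 20 and a hard limit of 512. The class and method names are hypothetical, treating all 12 nodes as live is an assumption, and the 0.32 invalidate-work percentage is an assumed setting that does not come from this thread.

```java
// Standalone sketch of the per-iteration work sizing; not the actual BlockManager code.
class DatanodeWorkEstimate {

    // Number of blocks considered for reconstruction work in one iteration.
    static int blocksToProcess(int numLive, int blocksReplWorkMultiplier) {
        return numLive * blocksReplWorkMultiplier;
    }

    // Number of nodes considered for invalidation work in one iteration.
    static int nodesToProcess(int numLive, float blocksInvalidateWorkPct) {
        return (int) Math.ceil(numLive * blocksInvalidateWorkPct);
    }

    public static void main(String[] args) {
        int numLive = 12;      // datanodes reported in this thread, assumed all live
        int multiplier = 20;   // dfs.namenode.replication.work.multiplier.per.iteration
        int hardLimit = 512;   // dfs.namenode.replication.max-streams-hard-limit
        float pct = 0.32f;     // assumed invalidate-work percentage

        int blocks = blocksToProcess(numLive, multiplier);
        System.out.println("blocksToProcess = " + blocks);
        System.out.println("nodesToProcess  = " + nodesToProcess(numLive, pct));
        System.out.println("below hard limit: " + (blocks < hardLimit));
    }
}
```

With those figures blocksToProcess is 12 x 20 = 240, which is below the hard limit of 512, the condition described above.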
 




[jira] [Commented] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-06-07 Thread caozhiqiang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17550907#comment-17550907
 ] 

caozhiqiang commented on HDFS-16613:


[~hadachi] , thank you for your review.

Firstly, my Hadoop branch includes HDFS-14768. In my test, even when the 
decommissioning node is busy, EC blocks are not reconstructed. The EC tasks 
are not sent to the datanode; the blocks are only held in 
BlockManager::pendingReconstruction. After the timeout, these blocks are put 
back into BlockManager::neededReconstruction and rescheduled next time. So all 
blocks on the decommissioning node use replication rather than reconstruction. By the 
way, I decommission only one dn at a time.

Secondly, there are 12 datanodes in my cluster, and each dn has 12 disks. There 
are 27217 EC block groups in my cluster and about 2 blocks in one datanode. 
Apart from the decommissioning node, the other nodes' load is very low, including 
load average, CPU iowait and network.

!image-2022-06-07-17-55-40-203.png|width=772,height=192!

!image-2022-06-07-17-45-45-316.png|width=772,height=198!

!image-2022-06-07-17-51-04-876.png|width=769,height=256!




[jira] [Commented] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-06-07 Thread Hiroyuki Adachi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17550845#comment-17550845
 ] 

Hiroyuki Adachi commented on HDFS-16613:


Thank you for sharing.

How many EC blocks does the decommissioning datanode have, and how many 
datanodes are in your cluster?

I'm also interested in the load (network traffic, disk I/O, etc.) of the other 
datanodes while decommissioning. As I mentioned above, I think the other 
datanodes' loads were higher due to the reconstruction tasks. Was there any 
impact?




[jira] [Commented] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-06-06 Thread caozhiqiang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17550782#comment-17550782
 ] 

caozhiqiang commented on HDFS-16613:


In my cluster tests, the following optimizations maximize the IO 
performance of the decommissioning DN. The time spent decommissioning a 
DN was reduced from 3 hours to half an hour.
 # Add this patch
 # Increase the value of dfs.namenode.replication.max-streams-hard-limit
 # Decrease the value of dfs.namenode.reconstruction.pending.timeout-sec to 
shorten the time interval for checking pendingReconstructions.




[jira] [Commented] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-06-06 Thread Hiroyuki Adachi (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17550779#comment-17550779
 ] 

Hiroyuki Adachi commented on HDFS-16613:


Hi [~caozhiqiang] ,

Using dfs.namenode.replication.max-streams-hard-limit is simple, but in my 
understanding, it makes the decommissioning node busy, and most of the EC 
blocks will not be replicated but reconstructed (see HDFS-14768). Since 
reconstruction is expensive, HDFS-8786 made decommissioning nodes use replication 
for EC blocks. Some people may prefer this.
What do you think of this?




[jira] [Commented] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-06-03 Thread caozhiqiang (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17545904#comment-17545904
 ] 

caozhiqiang commented on HDFS-16613:


[~tasanuma] [~hadachi], besides adding a new configuration to limit the 
decommissioning dn separately, we can also use 
dfs.namenode.replication.max-streams-hard-limit to achieve the same 
purpose. We only need to modify DatanodeManager::handleHeartbeat() and use 
dfs.namenode.replication.max-streams-hard-limit to compute numReplicationTasks for 
the decommissioning dn. I will create a new PR; please help review it.
{code:java}
      int maxTransfers;
      if (nodeinfo.isDecommissionInProgress()) {
        maxTransfers = blockManager.getReplicationStreamsHardLimit()
            - xmitsInProgress;
      } else {
        maxTransfers = blockManager.getMaxReplicationStreams()
            - xmitsInProgress;
      } {code}




[jira] [Commented] (HDFS-16613) EC: Improve performance of decommissioning dn with many ec blocks

2022-05-31 Thread Takanobu Asanuma (Jira)


[ 
https://issues.apache.org/jira/browse/HDFS-16613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17544658#comment-17544658
 ] 

Takanobu Asanuma commented on HDFS-16613:
-

Nice catch. We have also seen this problem. CC: [~hadachi] 
