Wei-Chiu Chuang created HDDS-15014:
--------------------------------------
Summary: Speed up EC container decommission
Key: HDDS-15014
URL: https://issues.apache.org/jira/browse/HDDS-15014
Project: Apache Ozone
Issue Type: Epic
Components: SCM
Reporter: Wei-Chiu Chuang
When Ozone decommissions a datanode, it must wait until all containers are fully
replicated to other datanodes in the cluster.
In the case of EC containers, instead of reconstructing the block, Ozone
replicates it to a destination datanode. That replication becomes a bottleneck:
no matter how big the cluster is, the container block is always replicated
from the same source (the decommissioning datanode).
Proposed solution: keep replicating by default, but switch to reconstruction
when too many containers are pending. Reconstruction draws on other datanodes
in the cluster, so it increases parallelism.
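
The switch between replication and reconstruction could be sketched roughly as
below. This is only an illustration of the idea, not Ozone's ReplicationManager
code: the class EcDecommissionPolicy, the Action enum, and the pendingThreshold
setting are all hypothetical names, and the issue does not say what number
counts as "too many" pending containers.

```java
/** Hypothetical sketch of the proposed policy; not an actual Ozone class. */
final class EcDecommissionPolicy {

  /** How an EC replica on the decommissioning datanode gets a new home. */
  enum Action {
    REPLICATE,   // copy the replica from the decommissioning datanode
    RECONSTRUCT  // rebuild the replica from the other replicas in its EC group
  }

  // Assumed tunable: the backlog size above which reconstruction is preferred.
  private final int pendingThreshold;

  EcDecommissionPolicy(int pendingThreshold) {
    if (pendingThreshold < 0) {
      throw new IllegalArgumentException("pendingThreshold must be >= 0");
    }
    this.pendingThreshold = pendingThreshold;
  }

  /**
   * Picks an action for the next container, given how many containers are
   * still pending on the decommissioning datanode.
   */
  Action choose(int pendingContainers) {
    // Small backlog: a plain copy from the single source is good enough.
    // Large backlog: spread the work over other datanodes via reconstruction.
    return pendingContainers > pendingThreshold
        ? Action.RECONSTRUCT
        : Action.REPLICATE;
  }
}
```

With a threshold of 100, for example, new EcDecommissionPolicy(100).choose(250)
selects RECONSTRUCT, while choose(40) keeps the current replication behavior.
Where the pending count comes from, and whether it is tracked per datanode or
per cluster, is left open here.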