Ethan Rose created HDDS-7198:
--------------------------------
Summary: Datanodes should avoid using decommissioning nodes as a
container replication source
Key: HDDS-7198
URL: https://issues.apache.org/jira/browse/HDDS-7198
Project: Apache Ozone
Issue Type: Improvement
Components: Ozone Datanode, SCM
Reporter: Ethan Rose
Currently, when SCM tells a target datanode to replicate a container, it sends
the target datanode an ordered list of source datanodes it should download the
container from. The target then shuffles the list and tries to download from
the sources in the resulting order one by one until one of them succeeds.
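
The shuffle-then-try behavior described above can be sketched roughly as
follows. This is a minimal illustration and not the actual Ozone code: the
class name, the downloadFromAny method, and the use of plain strings for
datanodes are all hypothetical, and the download attempt is stood in for by a
predicate.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Random;
import java.util.function.Predicate;

public class ShuffledSourceDownload {

  /**
   * Shuffles a copy of the source list and returns the first source for
   * which the download attempt succeeds, or empty if every attempt fails.
   * (Hypothetical helper; datanodes are represented by plain strings.)
   */
  static Optional<String> downloadFromAny(List<String> sources,
      Predicate<String> tryDownload, Random random) {
    // Copy first so the list sent by SCM is left untouched.
    List<String> shuffled = new ArrayList<>(sources);
    Collections.shuffle(shuffled, random);
    for (String source : shuffled) {
      if (tryDownload.test(source)) {
        return Optional.of(source);
      }
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    List<String> sources = List.of("dn1", "dn2", "dn3");
    // Pretend dn2 is unreachable; any other source succeeds.
    Optional<String> used =
        downloadFromAny(sources, dn -> !dn.equals("dn2"), new Random());
    System.out.println("Downloaded from: " + used.orElse("none"));
  }
}
```

Because every source is equally likely to come first after the shuffle, a
decommissioning node in the list is picked as often as any healthy replica.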
In failure scenarios this works fine. The node that had the failure will not be
included in the source list, so the source replication load is distributed
throughout the cluster. However, when a datanode is decommissioning, it will be
included in the source list with no distinction from other replicas, causing it
to bear a disproportionate amount of the replication load.
For example, if every container in the cluster has three replicas and one
datanode is being decommissioned, the decommissioning node will be the source
for about 33% of the replications, while the other 67% will be distributed
throughout the cluster based on placement of the other container replicas. With
datanodes currently throttled at 10 concurrent replication requests, this will
place continuous load on the decommissioning node (which may already be in a
bad state, which is why it is being removed), while decreasing parallelization
of the overall replications required.
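
One possible way to reduce that load, sketched below, is for the target to try
in-service sources first and fall back to decommissioning sources only when
the others fail. This is an assumption about how the improvement could look,
not a description of an agreed fix: the class name, the orderSources method,
and the idea that the target is handed a set of decommissioning nodes are all
hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;
import java.util.Set;

public class PreferInServiceSources {

  /**
   * Returns the sources in the order they should be tried: in-service
   * nodes first (shuffled among themselves), then decommissioning nodes
   * (also shuffled). Hypothetical helper; datanodes are plain strings.
   */
  static List<String> orderSources(List<String> sources,
      Set<String> decommissioning, Random random) {
    List<String> inService = new ArrayList<>();
    List<String> leaving = new ArrayList<>();
    for (String source : sources) {
      if (decommissioning.contains(source)) {
        leaving.add(source);
      } else {
        inService.add(source);
      }
    }
    // Shuffle within each group so load still spreads across healthy nodes.
    Collections.shuffle(inService, random);
    Collections.shuffle(leaving, random);
    List<String> ordered = new ArrayList<>(inService);
    ordered.addAll(leaving);
    return ordered;
  }

  public static void main(String[] args) {
    List<String> sources = List.of("dn1", "dn2", "dn3");
    // dn1 is decommissioning, so it is always tried last.
    List<String> ordered =
        orderSources(sources, Set.of("dn1"), new Random());
    System.out.println("Try order: " + ordered);
  }
}
```

Keeping the decommissioning node at the end of the list, rather than dropping
it, means it can still serve as a source when it holds the only readable
replica.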