[ 
https://issues.apache.org/jira/browse/HDDS-7198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17599172#comment-17599172
 ] 

Ethan Rose commented on HDDS-7198:
----------------------------------

cc [~sodonnell] as an SME on both decommissioning and erasure coding. When a 
datanode is decommissioning, are the EC replications done with offline recovery 
or container copy? If offline recovery is being used, I don't think EC data will 
have this problem.

> Datanodes should avoid using decommissioning nodes as a container replication 
> source
> ------------------------------------------------------------------------------------
>
>                 Key: HDDS-7198
>                 URL: https://issues.apache.org/jira/browse/HDDS-7198
>             Project: Apache Ozone
>          Issue Type: Improvement
>          Components: Ozone Datanode, SCM
>            Reporter: Ethan Rose
>            Priority: Major
>
> Currently when SCM tells a target datanode to replicate a container, it sends 
> the target datanode an ordered list of source datanodes it should download 
> the container from. The target then shuffles the list and tries to download 
> from the sources in the resulting order one by one until one of them succeeds.
>
> In failure scenarios this works fine. The node that had the failure will not 
> be included in the source list, distributing the source replication load 
> throughout the cluster. However, when a datanode is decommissioning, it will 
> be included in the source list with no distinction from other replicas, 
> causing it to bear a disproportionate amount of the replication load.
> For example, if every container in the cluster has three replicas and one 
> datanode is being decommissioned, the decommissioning node will be the source 
> for about 33% of the replications, while the other 67% will be distributed 
> throughout the cluster based on placement of the other container replicas. 
> With datanodes currently throttled at 10 concurrent replication requests, 
> this will place continuous load on the decommissioning node (which may 
> already be in a bad state, which is why it is being removed) while reducing 
> the parallelism of the overall replications required.
>  
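
The source-selection behaviour described in the issue can be sketched as follows. This is an illustrative Python model, not Ozone code: the names order_sources and first_choice_share are hypothetical, and moving decommissioning nodes to the back of the shuffled list is only one assumed way to "avoid" them, not necessarily the fix that was adopted.

```python
import random


def order_sources(sources, decommissioning, rng=random):
    """Shuffle the source list, then move decommissioning nodes to the back.

    Hypothetical helper: with an empty `decommissioning` set this models the
    plain shuffle described in the issue.
    """
    shuffled = list(sources)
    rng.shuffle(shuffled)
    # sorted() is stable, so the shuffled order is preserved within each group
    # (in-service nodes first, decommissioning nodes last).
    return sorted(shuffled, key=lambda node: node in decommissioning)


def first_choice_share(sources, node, decommissioning=frozenset(),
                       trials=10_000, seed=0):
    """Estimate how often `node` is the first source the target tries."""
    rng = random.Random(seed)
    hits = sum(
        order_sources(sources, decommissioning, rng)[0] == node
        for _ in range(trials)
    )
    return hits / trials


if __name__ == "__main__":
    replicas = ["dn1", "dn2", "dn3"]  # dn1 is the decommissioning node
    print("plain shuffle:", first_choice_share(replicas, "dn1"))
    print("deprioritized:", first_choice_share(replicas, "dn1", {"dn1"}))
```

With three replicas and a plain shuffle, the decommissioning node is tried first in roughly one third of replications, matching the figure in the description. With the assumed deprioritization it is only tried after the in-service replicas have failed.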


