[ https://issues.apache.org/jira/browse/HDDS-8617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Attila Doroszlai reassigned HDDS-8617:
--------------------------------------
Assignee: Attila Doroszlai
> Ratis underreplication due to maintenance is not deprioritised
> --------------------------------------------------------------
>
> Key: HDDS-8617
> URL: https://issues.apache.org/jira/browse/HDDS-8617
> Project: Apache Ozone
> Issue Type: Sub-task
> Components: SCM
> Affects Versions: 1.4.0
> Reporter: Attila Doroszlai
> Assignee: Attila Doroszlai
> Priority: Major
>
> According to the following javadoc, both decommission and maintenance
> replicas should be deprioritised:
> {code:title=https://github.com/apache/ozone/blob/6d9002201e58dc995dc133941acaef2af03cb9d2/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/ContainerHealthResult.java#L145-L164}
> /**
>  * The weightedRedundancy, is the remaining redundancy + the requeue count.
>  * When this value is used for ordering in a priority queue it ensures the
>  * priority is reduced each time it is requeued, to prevent it from blocking
>  * other containers from being processed.
>  * Additionally, so that decommission and maintenance replicas are not
>  * ordered ahead of under-replicated replicas, a redundancy of
>  * DECOMMISSION_REDUNDANCY is used for the decommission redundancy rather
>  * than its real redundancy.
>  * @return The weightedRedundancy of this result.
>  */
> public int getWeightedRedundancy() {
>   int result = requeueCount;
>   if (dueToDecommission) {
>     result += DECOMMISSION_REDUNDANCY;
>   } else {
>     result += getRemainingRedundancy();
>   }
>   return result;
> }
> {code}
> However, {{dueToDecommission=true}} is set based only on decommission replicas
> ({{decommissionCount}}), ignoring maintenance replicas ({{maintenanceCount}}):
> {code:title=https://github.com/apache/ozone/blob/6d9002201e58dc995dc133941acaef2af03cb9d2/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/RatisContainerReplicaCount.java#L520-L533}
> /**
>  * Checks whether insufficient replication is because of some replicas
>  * being on datanodes that were decommissioned.
>  * @param includePendingAdd if pending adds should be considered
>  * @return true if there is insufficient replication and it's because of
>  * decommissioning.
>  */
> public boolean inSufficientDueToDecommission(boolean includePendingAdd) {
>   if (isSufficientlyReplicated(includePendingAdd)) {
>     return false;
>   }
>   int delta = redundancyDelta(true, includePendingAdd);
>   return decommissionCount >= delta;
> }
> {code}
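
To make the mismatch concrete, here is a minimal standalone sketch of the two checks quoted above. It is not the actual Ozone code: the class name, the method insufficientDueToOutOfService, and the value chosen for DECOMMISSION_REDUNDANCY are assumptions for illustration only, and delta is treated simply as the number of missing replicas for a container already known to be under-replicated.

```java
// Sketch only: names and values are illustrative assumptions,
// not the real Ozone classes or constants.
final class MaintenanceDeprioritisationSketch {

  // Assumed placeholder value for the constant in ContainerHealthResult.
  static final int DECOMMISSION_REDUNDANCY = 5;

  /**
   * Mirrors the quoted check: only decommission replicas are counted.
   * Assumes the container is already known to be under-replicated.
   */
  static boolean insufficientDueToDecommission(int decommissionCount,
      int delta) {
    return decommissionCount >= delta;
  }

  /** Hypothetical variant that also counts maintenance replicas. */
  static boolean insufficientDueToOutOfService(int decommissionCount,
      int maintenanceCount, int delta) {
    return decommissionCount + maintenanceCount >= delta;
  }

  /** Mirrors the quoted getWeightedRedundancy() logic. */
  static int weightedRedundancy(boolean deprioritised,
      int remainingRedundancy, int requeueCount) {
    int result = requeueCount;
    if (deprioritised) {
      result += DECOMMISSION_REDUNDANCY;
    } else {
      result += remainingRedundancy;
    }
    return result;
  }

  public static void main(String[] args) {
    // One replica is missing, and it is on a node in maintenance.
    int delta = 1;
    int decommissionCount = 0;
    int maintenanceCount = 1;
    int remainingRedundancy = 1;
    int requeueCount = 0;

    boolean current =
        insufficientDueToDecommission(decommissionCount, delta);
    boolean proposed = insufficientDueToOutOfService(
        decommissionCount, maintenanceCount, delta);

    System.out.println("current check, weighted redundancy: "
        + weightedRedundancy(current, remainingRedundancy, requeueCount));
    System.out.println("variant check, weighted redundancy: "
        + weightedRedundancy(proposed, remainingRedundancy, requeueCount));
  }
}
```

With the current check the maintenance-only case keeps its real remaining redundancy, so it is ordered like any other under-replicated container. Counting maintenance replicas as well, as in the hypothetical variant, would give it the larger DECOMMISSION_REDUNDANCY weight and move it later in the queue, which is what the javadoc describes.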