xBis7 commented on code in PR #5726:
URL: https://github.com/apache/ozone/pull/5726#discussion_r1427902450
##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/SCMCommonPlacementPolicy.java:
##########
@@ -445,6 +445,7 @@ public ContainerPlacementStatus validateContainerPlacement(
}
}
List<Integer> currentRackCount = new ArrayList<>(dns.stream()
+ .filter(d -> !(d.isDecommissioned()))
Review Comment:
@sodonnel I've been looking into your suggestion. With that change, we will
be ignoring mis-replication for every over-replicated container.
Let's say we have this scenario:
* RackAware policy
* 7 datanodes, 3 are IN_SERVICE, 4 are offline
* 2 of the offline nodes get back IN_SERVICE
* Now we have 5 nodes IN_SERVICE and 2 offline
* There are 5 available replicas but we only need 3, so the container is
over-replicated
Initially
```
/rack0/node0 -> in_service - used
/rack0/node1 -> in_service - used
/rack0/node2 -> offline
/rack0/node3 -> offline
/rack0/node4 -> offline
/rack1/node5 -> in_service - used
/rack1/node6 -> offline
```
After 2 of the offline nodes come back
```
/rack0/node0 -> in_service - used
/rack0/node1 -> in_service - used
/rack0/node2 -> offline -> in_service - used
/rack0/node3 -> offline
/rack0/node4 -> offline
/rack1/node5 -> in_service - used
/rack1/node6 -> offline -> in_service - used
```
The placement will be [3, 2], where we need [2, 1]. The container is
mis-replicated, but because we raise the maxReplica number by adding the
replicaDelta, the policy is reported as satisfied.
`maxReplicaPerRack = 2` (before the adjustment)
`dns.size() = 7`
`replicas = 3`
`Math.max(0, dns.size() - replicas) = Math.max(0, 4) = 4`, regardless of
whether the excess replicas are healthy or not.
`maxReplicaPerRack = 2 + 4 = 6` (after the adjustment)
Even if we add a check so that the maxReplica number is increased only when
there are decommissioning or maintenance nodes, it won't make a difference,
because we can't distinguish between replicas on offline nodes and replicas
on online nodes. This approach will effectively ignore `mis-replication`
when it is caused by `over-replication`.
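To make the arithmetic above concrete, here is a minimal standalone sketch. It is not the actual `SCMCommonPlacementPolicy` code: the class and method names (`PlacementDeltaSketch`, `adjustedMaxPerRack`, `isPolicySatisfied`) are hypothetical, and the check is reduced to "no rack holds more replicas than the limit", which is an assumption made only for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the numbers in this comment; not the real Ozone code.
final class PlacementDeltaSketch {

    // Per-rack limit after adding the replica delta, as in the suggestion.
    static int adjustedMaxPerRack(int maxReplicaPerRack, int dnsSize, int replicas) {
        int replicaDelta = Math.max(0, dnsSize - replicas);
        return maxReplicaPerRack + replicaDelta;
    }

    // Simplified check (assumption): every rack count must be within the limit.
    static boolean isPolicySatisfied(List<Integer> rackCounts, int limit) {
        return rackCounts.stream().allMatch(count -> count <= limit);
    }

    public static void main(String[] args) {
        // In-service replicas per rack after 2 nodes come back: rack0=3, rack1=2.
        List<Integer> rackCounts = Arrays.asList(3, 2);
        int originalLimit = 2;
        int adjustedLimit = adjustedMaxPerRack(originalLimit, 7, 3);

        System.out.println("adjusted limit = " + adjustedLimit);
        System.out.println("satisfied with original limit: "
            + isPolicySatisfied(rackCounts, originalLimit));
        System.out.println("satisfied with adjusted limit: "
            + isPolicySatisfied(rackCounts, adjustedLimit));
    }
}
```

With the original limit of 2 the [3, 2] placement fails the check, but with the adjusted limit of 6 it passes, which is the masking effect described above.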
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]