sodonnel commented on code in PR #4006:
URL: https://github.com/apache/ozone/pull/4006#discussion_r1040179523
##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/SCMCommonPlacementPolicy.java:
##########
@@ -426,4 +451,67 @@ public boolean isValidNode(DatanodeDetails datanodeDetails,
}
return false;
}
+
+ /**
+ * Given a set of replicas of a container which are
+ * neither over underreplicated nor overreplicated,
+ * return a set of replicas to copy to another node to fix misreplication.
+ * @param replicas
+ */
+ @Override
+ public Set<ContainerReplica> replicasToCopyToFixMisreplication(
+ Set<ContainerReplica> replicas) {
+ Map<Node, List<ContainerReplica>> placementGroupReplicaIdMap
+ = replicas.stream().collect(Collectors.groupingBy(replica ->
+ this.getPlacementGroup(replica.getDatanodeDetails())));
+
+ int totalNumberOfReplicas = replicas.size();
+ int requiredNumberOfPlacementGroups =
+ getRequiredRackCount(totalNumberOfReplicas);
+ int additionalNumberOfRacksRequired = Math.max(
+ requiredNumberOfPlacementGroups -
placementGroupReplicaIdMap.size(),
+ 0);
+ int replicasPerPlacementGroup =
+ getMaxReplicasPerRack(totalNumberOfReplicas);
+ Set<ContainerReplica> copyReplicaSet = Sets.newHashSet();
+
+ for (List<ContainerReplica> replicaList: placementGroupReplicaIdMap
+ .values()) {
+ if (replicaList.size() > replicasPerPlacementGroup) {
+ List<ContainerReplica> replicasToBeCopied = replicaList.stream()
+ .limit(replicaList.size() - replicasPerPlacementGroup)
+ .collect(Collectors.toList());
+ copyReplicaSet.addAll(replicasToBeCopied);
+ replicaList.removeAll(replicasToBeCopied);
+ }
+ }
+ if (additionalNumberOfRacksRequired > copyReplicaSet.size()) {
Review Comment:
Would this algorithm work:
```
int replicaCount = replicas.length;
int remainingRacks = requiredRacks;
while (orderedRacks.size() > 1 && remainingRacks > 0) {
int maxOnRack = replicaCount / remainingRacks +
Math.min(replicaCount % remainingRacks, 1);
List<String> nodes = orderedRacks.poll();
if (nodes.size() > maxOnRack) {
// Copy those over maxPerRack.
replicaCount = replicaCount - maxOnRack;
} else {
// Nothing to remove from this rack
replicaCount = replicaCount - nodes.size();
}
remainingRacks -= 1;
}
```
Where orderedRacks is the queue you formed, basically we start a grouping of
Rack -> List<ContainerReplica> and the number of replicas and the number of
racks we require.
Then we check if the biggest rack has over the max it should have, if it
does, add the excess to copy.
Then we reduce the replica count by what will remain on this largest rack,
and reduce the racks by 1, and do the calculation again to find the next max
per rack, and so on. We are sort of recursively peeling off the largest rack
and fixing it and then checking the rest until we get it down to just 2 racks
or have met the required rack count.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]