swamirishi commented on code in PR #4006:
URL: https://github.com/apache/ozone/pull/4006#discussion_r1041292440
##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/SCMCommonPlacementPolicy.java:
##########
@@ -426,4 +451,67 @@ public boolean isValidNode(DatanodeDetails datanodeDetails,
}
return false;
}
+
+ /**
+ * Given a set of replicas of a container which are
+ * neither over underreplicated nor overreplicated,
+ * return a set of replicas to copy to another node to fix misreplication.
+ * @param replicas
+ */
+ @Override
+ public Set<ContainerReplica> replicasToCopyToFixMisreplication(
+ Set<ContainerReplica> replicas) {
+ Map<Node, List<ContainerReplica>> placementGroupReplicaIdMap
+ = replicas.stream().collect(Collectors.groupingBy(replica ->
+ this.getPlacementGroup(replica.getDatanodeDetails())));
+
+ int totalNumberOfReplicas = replicas.size();
+ int requiredNumberOfPlacementGroups =
+ getRequiredRackCount(totalNumberOfReplicas);
+ int additionalNumberOfRacksRequired = Math.max(
+ requiredNumberOfPlacementGroups -
placementGroupReplicaIdMap.size(),
+ 0);
+ int replicasPerPlacementGroup =
+ getMaxReplicasPerRack(totalNumberOfReplicas);
+ Set<ContainerReplica> copyReplicaSet = Sets.newHashSet();
+
+ for (List<ContainerReplica> replicaList: placementGroupReplicaIdMap
+ .values()) {
+ if (replicaList.size() > replicasPerPlacementGroup) {
+ List<ContainerReplica> replicasToBeCopied = replicaList.stream()
+ .limit(replicaList.size() - replicasPerPlacementGroup)
+ .collect(Collectors.toList());
+ copyReplicaSet.addAll(replicasToBeCopied);
+ replicaList.removeAll(replicasToBeCopied);
+ }
+ }
+ if (additionalNumberOfRacksRequired > copyReplicaSet.size()) {
Review Comment:
@sodonnel
When there is a lot of skew in replicas per rack this might lead to that
kind of situation where maxReplicaPerRack would change significantly.
I came up with this case consider this is placement with total number of
replicas=13. I know this is bizarre case but consider this is the distribution.
5 Racks case: 5,4,2,1,1
In the first iteration:
Max Replicas per rack = 13/5 = 2.6 rounded up to 3
We will move 2 replicas from rack 1.
In the second iteration:
Max Replicas per rack = 8/4 = 2
We will move 2 replicas from rack 2
In the third iteration:
Max Replicas per rack = 4/3 = 1.33 rounded up to 2
No replicas would be moved from rack 3
In the 4th iteration:
Max Replicas per rack = 2/2 =. 1
No replicas would be moved from rack 4
In the 5th iteration:
Max Replicas per rack. = 1/1 = 1
No replicas would be moved from rack 5
Thus we are moving 4 replicas according to this 2 algorithm 2 from rack 1 &
2 from rack 2.
But overall requirement by the policy is to just have maximum 3 replicas per
rack = 13/5 = 2.6 rounded up to 3
We could have achieved this by removing 2 replicas from rack 1 & 1 replica
from rack 2. In this case we are just moving 3 replicas overall which is lesser
than 4. So this algorithm might over copy replicas as per my previous comment.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]