swamirishi commented on code in PR #4006:
URL: https://github.com/apache/ozone/pull/4006#discussion_r1040187251
##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/SCMCommonPlacementPolicy.java:
##########
@@ -426,4 +451,67 @@ public boolean isValidNode(DatanodeDetails datanodeDetails,
}
return false;
}
+
+ /**
+ * Given a set of replicas of a container which are
+ * neither underreplicated nor overreplicated,
+ * return a set of replicas to copy to another node to fix misreplication.
+ * @param replicas
+ */
+ @Override
+ public Set<ContainerReplica> replicasToCopyToFixMisreplication(
+ Set<ContainerReplica> replicas) {
+ Map<Node, List<ContainerReplica>> placementGroupReplicaIdMap
+ = replicas.stream().collect(Collectors.groupingBy(replica ->
+ this.getPlacementGroup(replica.getDatanodeDetails())));
+
+ int totalNumberOfReplicas = replicas.size();
+ int requiredNumberOfPlacementGroups =
+ getRequiredRackCount(totalNumberOfReplicas);
+ int additionalNumberOfRacksRequired = Math.max(
+ requiredNumberOfPlacementGroups - placementGroupReplicaIdMap.size(),
+ 0);
+ int replicasPerPlacementGroup =
+ getMaxReplicasPerRack(totalNumberOfReplicas);
+ Set<ContainerReplica> copyReplicaSet = Sets.newHashSet();
+
+ for (List<ContainerReplica> replicaList: placementGroupReplicaIdMap
+ .values()) {
+ if (replicaList.size() > replicasPerPlacementGroup) {
+ List<ContainerReplica> replicasToBeCopied = replicaList.stream()
+ .limit(replicaList.size() - replicasPerPlacementGroup)
+ .collect(Collectors.toList());
+ copyReplicaSet.addAll(replicasToBeCopied);
+ replicaList.removeAll(replicasToBeCopied);
+ }
+ }
+ if (additionalNumberOfRacksRequired > copyReplicaSet.size()) {
Review Comment:
This wouldn't work when, say, we have 5 replicas and 4 racks, and the
current placement is:
Rack 1: 2
Rack 2: 2
Rack 3: 1
In this case one of the replicas from either Rack 1 or Rack 2 has to be
copied to a rack that is not yet in use, even though no rack exceeds the
per-rack limit.
Max replicas per rack is 2 in this case: (5 / 4) + (5 % 4 = 1) = 1 + 1 = 2
(integer division).
That is why I added another check for additionalNumberOfRacksRequired. It
runs the same algorithm, but compares the rack with the most replicas against
the rack with the second-highest number of replicas.
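
To make the 5-replica / 4-rack case concrete, here is a rough standalone sketch, not the code in this PR. The class `MisreplicationSketch`, the helper `replicasToCopy`, the rack names, and the per-rack formula are all assumptions made for illustration; the formula is only chosen so that it reproduces the arithmetic above. The first pass trims racks that are over the per-rack limit; the second pass, which is the extra check, keeps taking one replica from the currently fullest rack until enough replicas are selected to cover the missing racks.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MisreplicationSketch {

  // Assumed formula: integer division, plus one extra slot if there is a
  // remainder. For 5 replicas and 4 racks this gives 1 + 1 = 2.
  static int maxReplicasPerRack(int replicas, int racks) {
    return replicas / racks + Math.min(replicas % racks, 1);
  }

  // Hypothetical helper: returns, per rack, how many replicas should be
  // copied elsewhere so that requiredRacks racks can end up in use.
  static Map<String, Integer> replicasToCopy(
      Map<String, Integer> replicasPerRack, int requiredRacks) {
    int total = replicasPerRack.values().stream()
        .mapToInt(Integer::intValue).sum();
    int maxPerRack = maxReplicasPerRack(total, requiredRacks);
    int missingRacks = Math.max(requiredRacks - replicasPerRack.size(), 0);

    Map<String, Integer> remaining = new LinkedHashMap<>(replicasPerRack);
    Map<String, Integer> toCopy = new LinkedHashMap<>();
    int copied = 0;

    // Pass 1: trim every rack that holds more than the per-rack limit.
    for (Map.Entry<String, Integer> e : replicasPerRack.entrySet()) {
      int excess = e.getValue() - maxPerRack;
      if (excess > 0) {
        toCopy.merge(e.getKey(), excess, Integer::sum);
        remaining.put(e.getKey(), maxPerRack);
        copied += excess;
      }
    }

    // Pass 2: racks are still missing, so take one replica at a time
    // from whichever rack currently holds the most replicas.
    while (copied < missingRacks) {
      String fullest = null;
      for (Map.Entry<String, Integer> e : remaining.entrySet()) {
        if (fullest == null || e.getValue() > remaining.get(fullest)) {
          fullest = e.getKey();
        }
      }
      // Stop if moving a replica would empty a rack already in use.
      if (fullest == null || remaining.get(fullest) <= 1) {
        break;
      }
      remaining.merge(fullest, -1, Integer::sum);
      toCopy.merge(fullest, 1, Integer::sum);
      copied++;
    }
    return toCopy;
  }

  public static void main(String[] args) {
    Map<String, Integer> placement = new LinkedHashMap<>();
    placement.put("rack1", 2);
    placement.put("rack2", 2);
    placement.put("rack3", 1);
    System.out.println(maxReplicasPerRack(5, 4));     // prints 2
    System.out.println(replicasToCopy(placement, 4)); // prints {rack1=1}
  }
}
```

With only the first pass, the example placement selects nothing, because no rack holds more than 2 replicas; the second pass is what picks a replica from Rack 1 (Rack 2 would be equally valid).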
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]