sodonnel commented on code in PR #4006:
URL: https://github.com/apache/ozone/pull/4006#discussion_r1041499482
##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/SCMCommonPlacementPolicy.java:
##########
@@ -426,4 +453,47 @@ public boolean isValidNode(DatanodeDetails datanodeDetails,
}
return false;
}
+
+ /**
+ * Given a set of replicas of a container which are
+ * neither underreplicated nor overreplicated,
+ * return a set of replicas to copy to another node to fix misreplication.
+ * @param replicas
+ */
+ @Override
+ public Set<ContainerReplica> replicasToCopyToFixMisreplication(
+ Set<ContainerReplica> replicas) {
+ Map<Node, List<ContainerReplica>> placementGroupReplicaIdMap
+ = replicas.stream().collect(Collectors.groupingBy(replica ->
+ this.getPlacementGroup(replica.getDatanodeDetails())));
+
+ int totalNumberOfReplicas = replicas.size();
+ int requiredNumberOfPlacementGroups =
+ getRequiredRackCount(totalNumberOfReplicas);
+ Set<ContainerReplica> copyReplicaSet = Sets.newHashSet();
+ List<List<ContainerReplica>> replicaSet = placementGroupReplicaIdMap
+ .values().stream()
+ .sorted((o1, o2) -> Integer.compare(o2.size(), o1.size()))
+ .collect(Collectors.toList());
+ for (List<ContainerReplica> replicaList: replicaSet) {
+ int maxReplicasPerPlacementGroup = getMaxReplicasPerRack(
+ totalNumberOfReplicas, requiredNumberOfPlacementGroups);
+ int numberOfReplicasToBeCopied = Math.max(0,
Review Comment:
I **think** we should always subtract maxReplicasPerPlacementGroup, even if
the rack has fewer replicas than that. However, I am not 100% certain.
My logic here is that if the biggest rack has fewer than
maxReplicasPerPlacementGroup replicas, then we must either have more racks than
we need, or some other rack has more than the ideal number. If we don't subtract
maxReplicasPerPlacementGroup, then we may fail to move a replica off a rack
which is larger than it should be.
In practice it may work either way, as I cannot come up with an example where
it would break.
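
As a rough illustration of the "always subtract" reading, here is a minimal,
standalone sketch. It is not the code in this PR: the class name, the helper
names, and the assumption that the per-rack cap is the ceiling of
total replicas / required racks are all hypothetical, and racks are reduced to
plain replica counts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class MisreplicationSketch {

  // Assumption: the per-rack cap is ceil(totalReplicas / requiredRacks).
  static int maxReplicasPerRack(int totalReplicas, int requiredRacks) {
    return (totalReplicas + requiredRacks - 1) / requiredRacks;
  }

  // For each rack, largest first, always subtract the cap and clamp at zero.
  // The result is how many replicas each rack holds above the cap.
  static List<Integer> excessPerRack(List<Integer> rackSizes,
      int requiredRacks) {
    int total = rackSizes.stream().mapToInt(Integer::intValue).sum();
    int cap = maxReplicasPerRack(total, requiredRacks);
    List<Integer> sorted = new ArrayList<>(rackSizes);
    sorted.sort(Comparator.reverseOrder());
    List<Integer> excess = new ArrayList<>();
    for (int size : sorted) {
      excess.add(Math.max(0, size - cap));
    }
    return excess;
  }

  public static void main(String[] args) {
    // 5 replicas over racks of size 3, 1, 1 with 3 racks required.
    System.out.println(excessPerRack(Arrays.asList(3, 1, 1), 3));
    // 5 replicas, one per rack: more racks than needed, none over the cap.
    System.out.println(excessPerRack(Arrays.asList(1, 1, 1, 1, 1), 3));
  }
}
```

With these assumptions, a rack below the cap simply contributes zero, so
subtracting the cap unconditionally does not select replicas from racks that
are already within bounds.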
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]