swamirishi commented on code in PR #4006:
URL: https://github.com/apache/ozone/pull/4006#discussion_r1040277970
##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/SCMCommonPlacementPolicy.java:
##########
@@ -426,4 +451,67 @@ public boolean isValidNode(DatanodeDetails datanodeDetails,
}
return false;
}
+
+ /**
+ * Given the set of replicas of a container which is neither
+ * underreplicated nor overreplicated, return the subset of replicas
+ * that should be copied to other nodes to fix misreplication.
+ * @param replicas the current set of container replicas
+ */
+ @Override
+ public Set<ContainerReplica> replicasToCopyToFixMisreplication(
+ Set<ContainerReplica> replicas) {
+ Map<Node, List<ContainerReplica>> placementGroupReplicaIdMap
+ = replicas.stream().collect(Collectors.groupingBy(replica ->
+ this.getPlacementGroup(replica.getDatanodeDetails())));
+
+ int totalNumberOfReplicas = replicas.size();
+ int requiredNumberOfPlacementGroups =
+ getRequiredRackCount(totalNumberOfReplicas);
+ int additionalNumberOfRacksRequired = Math.max(
+ requiredNumberOfPlacementGroups - placementGroupReplicaIdMap.size(),
+ 0);
+ int replicasPerPlacementGroup =
+ getMaxReplicasPerRack(totalNumberOfReplicas);
+ Set<ContainerReplica> copyReplicaSet = Sets.newHashSet();
+
+ for (List<ContainerReplica> replicaList: placementGroupReplicaIdMap
+ .values()) {
+ if (replicaList.size() > replicasPerPlacementGroup) {
+ List<ContainerReplica> replicasToBeCopied = replicaList.stream()
+ .limit(replicaList.size() - replicasPerPlacementGroup)
+ .collect(Collectors.toList());
+ copyReplicaSet.addAll(replicasToBeCopied);
+ replicaList.removeAll(replicasToBeCopied);
+ }
+ }
+ if (additionalNumberOfRacksRequired > copyReplicaSet.size()) {
Review Comment:
> If you think it would work, could you try implementing it with my
algorithm and see how it looks? I think it will be less LOC and possibly
easier to understand.
>
> On 5 Dec 2022, at 22:59, Swaminathan Balachandran wrote: The algorithm
you are suggesting would also work.
@sodonnel
I thought about it; we might end up copying more replicas than required in
certain cases. When we sort in descending order and iterate, the numerator
decreases at a faster rate than the denominator. For the 5-replica case we
would not be able to maintain the maximum number of replicas per rack, so
I am not sure this is the right approach; in the end we might copy more
replicas than required overall. That said, I have not been able to come up
with a test case that demonstrates it.
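For context, the overflow-selection loop from the hunk above can be
sketched in isolation. This is a minimal, hedged sketch, not Ozone code:
plain strings stand in for `ContainerReplica` and `Node`, and the names
`replicasToCopy`/`maxPerRack` are illustrative, not Ozone APIs. It only
shows the per-rack surplus selection, not the rack-count handling that
follows in the patch.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class MisreplicationSketch {
    // For every rack holding more than maxPerRack replicas, pick the
    // surplus (size - maxPerRack) replicas as candidates to copy to
    // another rack. Mirrors the loop over placementGroupReplicaIdMap
    // in the diff, minus the removeAll bookkeeping.
    static Set<String> replicasToCopy(Map<String, List<String>> byRack,
                                      int maxPerRack) {
        Set<String> toCopy = new HashSet<>();
        for (List<String> onRack : byRack.values()) {
            if (onRack.size() > maxPerRack) {
                // take the surplus from the front of the list, as the
                // patch does with Stream.limit
                toCopy.addAll(onRack.subList(0, onRack.size() - maxPerRack));
            }
        }
        return toCopy;
    }

    public static void main(String[] args) {
        Map<String, List<String>> byRack = new HashMap<>();
        byRack.put("rack1", new ArrayList<>(List.of("r1", "r2", "r3")));
        byRack.put("rack2", new ArrayList<>(List.of("r4")));
        // with at most 2 replicas per rack, rack1 has one surplus replica
        System.out.println(replicasToCopy(byRack, 2).size()); // prints 1
    }
}
```

Under this reading, the surplus replicas are exactly what the patch feeds
into `copyReplicaSet` before checking whether additional racks are still
required.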
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]