sodonnel commented on code in PR #4014:
URL: https://github.com/apache/ozone/pull/4014#discussion_r1063844308


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/SCMCommonPlacementPolicy.java:
##########
@@ -503,4 +508,100 @@ public Set<ContainerReplica> 
replicasToCopyToFixMisreplication(
   protected Node getPlacementGroup(DatanodeDetails dn) {
     return nodeManager.getClusterNetworkTopologyMap().getAncestor(dn, 1);
   }
+
+  /**
+   * Given a set of replicas, expectedCount for Each replica,
+   * number of unique replica indexes. Replicas to be deleted for fixing over
+   * replication is computed.
+   * The algorithm starts with creating a replicaIdMap which contains the
+   * replicas grouped by replica Index. A placementGroup Map is created which
+   * groups replicas based on their rack & the replicas within the rack
+   * are further grouped based on the replica Index.
+   * A placement Group Count Map is created which keeps
+   * track of the count of replicas in each rack.
+   * We iterate through overreplicated replica indexes sorted in descending
+   * order based on their current replication factor in a descending factor.
+   * For each replica Index the replica is removed from the rack which contains
+   * the most replicas, in order to achieve this the racks are put
+   * into priority queue & are based on the number of replicas they have.
+   * The replica is removed from the rack with maximum replicas & the replica
+   * to be removed is also removed from the maps created above &
+   * the count for rack is reduced.
+   * The set of replicas computed are then returned by the function.
+   * @param replicas: Map of replicas with value signifying if
+   *                  replica can be copied
+   * @param expectedCountPerUniqueReplica
+   * @return Set of replicas to be removed are computed.
+   */
+  @Override
+  public Set<ContainerReplica> replicasToRemoveToFixOverreplication(
+          Set<ContainerReplica> replicas, int expectedCountPerUniqueReplica) {
+    Map<Integer, Set<ContainerReplica>> replicaIdMap = new HashMap<>();
+    Map<Node, Map<Integer, Set<ContainerReplica>>> placementGroupReplicaIdMap
+            = new HashMap<>();
+    Map<Node, Integer> placementGroupCntMap = new HashMap<>();
+    for (ContainerReplica replica:replicas) {
+      Integer replicaId = replica.getReplicaIndex();
+      Node placementGroup = getPlacementGroup(replica.getDatanodeDetails());
+      if (!replicaIdMap.containsKey(replicaId)) {

Review Comment:
   I think this can be simplified to:
   
   ```
   replicaIdMap.computeIfAbsent(replicaID, k -> Sets.newHashSet()).add(replica);
   ```
   
   And then line 554 can go away too.
   
   Similar for the one below.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to