swamirishi commented on code in PR #4006:
URL: https://github.com/apache/ozone/pull/4006#discussion_r1033960713


##########
hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/PlacementPolicy.java:
##########
@@ -22,12 +22,14 @@
 import java.io.IOException;
 import java.util.Collections;
 import java.util.List;
+import java.util.Map;
+import java.util.Set;
 
 /**
  * A PlacementPolicy support choosing datanodes to build
  * pipelines or containers with specified constraints.
  */
-public interface PlacementPolicy {
+public interface PlacementPolicy<Replica, PlacementGroup> {

Review Comment:
   > Taking the `replicasToCopy` list to start with, as it is easier.
   > 
   > What is passed into this method should be neither over- nor under-replicated; the only potential problem is mis-replication.
   > 
   > For Random and Capacity placement, neither can be mis-replicated, as they don't care about racks.
   > 
   > For RackScatter, the policy says we should spread the replicas across as 
many racks as possible.
   > 
   > This means we have:
   > 
   > ```
   > expectedRacks = Math.min(Total-Racks-In-Cluster, replica-count)
   > 
   > maxReplicasPerRack = Math.ceil(replica-count / expectedRacks)
   > ```
   > 
   > If we group the replicas into something like:
   > 
   > ```
   >     rack -> List<ContainerReplica>
   > ```
   > 
   > Then any rack with more than maxReplicasPerRack in its list needs to have 
some copied elsewhere.
   > 
   > It doesn't matter which we copy, as each replica is unique when there is no over- or under-replication.
   > 
   > For RackScatter, after you group the replicas per rack, I think the logic comes down to:
   > 
   > ```
   > List<ContainerReplica> toCopy = new ArrayList<>();
   > for (List<ContainerReplica> repList : rackToReplicas.values()) {
   >   if (repList.size() <= maxReplicasPerRack) {
   >     // There are not too many replicas on this rack, so we don't
   >     // need to worry about it; just check the next one.
   >     continue;
   >   }
   >   // There are too many replicas on this rack, so we need to copy
   >   // enough of them elsewhere to get down to maxReplicasPerRack:
   >   for (int i = 0; i < repList.size() - maxReplicasPerRack; i++) {
   >     toCopy.add(repList.get(i));
   >   }
   > }
   > return toCopy;
   > ```
   > 
   > For the RackAware placement policy, the policy says the replicas must be on at least 2 racks, so I think we can just say:
   > 
   > ```
   > // The inner Math.min handles replica=ONE containers
   > expectedRacks = Math.min(Total-Racks-In-Cluster, Math.min(replica-count, 2))
   > ```
   > 
   > And then the logic is exactly the same.
   > 
   > In the current implementation, there seem to be many more groupings than the ones above, and I am not sure they are needed.
   > 
   > Also, I know it was me who mentioned the parameters `int expectedCountPerUniqueReplicas, int expectedUniqueGroups`, but do we actually need them to identify the replicas to copy in these two placement policies? Would it be more sensible to pass a ReplicationConfig instead, from which we could infer the groups if needed?
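   
   As a side note on the quoted formulas: in Java, `Math.ceil(replica-count / expectedRacks)` would truncate before the ceil is applied if both operands are ints, so the real code needs integer ceiling division. A minimal sketch (class and method names are illustrative, not from the PR):
   
   ```java
   // Sketch of the quoted formulas. Math.ceil(a / b) on two ints
   // truncates first, so use integer ceiling division instead.
   final class RackMath {
     static int expectedRacks(int totalRacksInCluster, int replicaCount) {
       return Math.min(totalRacksInCluster, replicaCount);
     }
   
     static int maxReplicasPerRack(int replicaCount, int expectedRacks) {
       // Integer ceiling division: ceil(replicaCount / expectedRacks)
       return (replicaCount + expectedRacks - 1) / expectedRacks;
     }
   }
   ```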
   
   Should we add `expectedCountPerUniqueReplicas` and `expectedUniqueGroups` to the ReplicationConfig interface? I don't concur with the PlacementPolicy trying to understand the inner workings of EC and Ratis.
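
For reference, the "group replicas by rack, then pick the overflow" idea discussed above could be sketched roughly as follows. This is a sketch only: the generic type `R` stands in for `ContainerReplica`, and the `String` key stands in for a rack identifier from the network topology.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of the rack-overflow selection from the review discussion.
final class RackOverflow {
  static <R> List<R> selectReplicasToCopy(
      Map<String, List<R>> rackToReplicas, int maxReplicasPerRack) {
    List<R> toCopy = new ArrayList<>();
    for (List<R> repList : rackToReplicas.values()) {
      if (repList.size() <= maxReplicasPerRack) {
        continue;  // this rack is within its limit
      }
      // Too many replicas on this rack: mark enough of them to bring
      // it down to maxReplicasPerRack. Which ones we pick does not
      // matter when there is no over- or under-replication.
      for (int i = 0; i < repList.size() - maxReplicasPerRack; i++) {
        toCopy.add(repList.get(i));
      }
    }
    return toCopy;
  }
}
```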



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

