sodonnel commented on a change in pull request #881:
URL: https://github.com/apache/hadoop-ozone/pull/881#discussion_r420707400



##########
File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ReplicationManager.java
##########
@@ -512,25 +523,60 @@ private void handleUnderReplicatedContainer(final ContainerInfo container,
       if (source.size() > 0) {
         final int replicationFactor = container
             .getReplicationFactor().getNumber();
-        final int delta = replicationFactor - getReplicaCount(id, replicas);
+        // Want to check if the container is mis-replicated after considering
+        // inflight add and delete.
+        // Create a new list from source (healthy replicas minus pending delete)
+        List<DatanodeDetails> targetReplicas = new ArrayList<>(source);
+        // Then add any pending additions
+        targetReplicas.addAll(replicationInFlight);
+
+        int delta = replicationFactor - getReplicaCount(id, replicas);
+        final ContainerPlacementStatus placementStatus =
+            containerPlacement.validateContainerPlacement(
+                targetReplicas, replicationFactor);
+        final int misRepDelta = placementStatus.additionalReplicaRequired();

Review comment:
       > I understand the statement "a container has a valid placement" as "we have the replication factor amount of replicas placed as we want them to be placed"
   
   Placement and replication are two different things, and I think it is good 
to keep them that way.
   
   If you consider HDFS fsck - if a container is under replicated, it will report that. If it is mis-replicated, it will report that. If it is both, it reports them both separately, along with the "missing count" for each. In HDFS, for racks, there can only ever be 1 additional rack needed, but with upgrade domains, possibly two additional domains can be needed.
   
   > I see this as something that still implicitly brings in rack awareness into ReplicationManager via an arbitrary int value
   
   It brings mis-replication into the RM, not rack awareness, and this is OK. A container needs to meet several goals to be properly replicated - the correct number of healthy replicas and the correct placement - and this is what the replication manager checks.
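
   For illustration, a rough sketch (not the actual ReplicationManager code; the surrounding wiring from the diff above is assumed) of how the two signals stay separate:

   ```
   final int delta = replicationFactor - getReplicaCount(id, replicas);
   final ContainerPlacementStatus placementStatus =
       containerPlacement.validateContainerPlacement(
           targetReplicas, replicationFactor);
   // delta > 0            -> under replicated: too few healthy replicas.
   // !isPolicySatisfied() -> mis-replicated: replicas on too few racks.
   // Either can be true without the other, so they are checked and
   // reported separately rather than folded into one number.
   final boolean underReplicated = delta > 0;
   final boolean misReplicated = !placementStatus.isPolicySatisfied();
   ```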
   
   In Recon, we want to be able to indicate mis-replication and under replication separately in a similar way. Ideally, Recon will just load the exact same placement policy as the cluster is using and call the same methods.
   
   > I am unhappy about the check in lines from L562, and suggested a way there if we can achieve it.
   
   This check is still useful. Suppose we combined the two rules as you are suggesting, so that "additionalReplicaRequired" returns a single number representing both delta and misRep. In the case where delta=1 and misRep=1 (ie two replicas on one rack), the newly selected node may, due to fallback, give us 3 replicas on one rack. That is OK, but the next time this container is checked we have 3 replicas on one rack, and the logic will return 1, telling us to create another. We need to be sure a new replica improves the mis-replication before creating a 4th replica, as that will make the container over-replicated.
   
   Of course, we can check:
   
    * If we test 2 replicas with 1 rack, the logic will return 1.
    * If we test 3 replicas with 1 rack, the logic will return 1.
    * If we test 4 replicas with 1 rack, the logic will return 1.
   
   So if we push the under- and mis-replication numbers together, we cannot really see what is going on, and I think we will need to test the replication count separately anyway, as the sketch below illustrates.
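
   To make this concrete, here is a toy, self-contained sketch. This is not the real Ozone placement code; it assumes a rack-aware policy that wants replicas on at least 2 distinct racks, and the class and method names are made up for illustration:

   ```
   import java.util.List;

   /** Toy model only - not the Ozone ContainerPlacementStatus API. */
   public class PlacementOnlyCheckDemo {

     /** Assume the policy wants min(replicationFactor, 2) distinct racks. */
     static int additionalReplicaRequired(List<String> replicaRacks,
         int replicationFactor) {
       int requiredRacks = Math.min(replicationFactor, 2);
       int usedRacks = (int) replicaRacks.stream().distinct().count();
       return Math.max(0, requiredRacks - usedRacks);
     }

     public static void main(String[] args) {
       // 2, 3 and 4 replicas, all on rack "r1" - the placement delta is 1
       // every time, so the replica count cannot be recovered from it.
       System.out.println(additionalReplicaRequired(
           List.of("r1", "r1"), 3));                     // prints 1
       System.out.println(additionalReplicaRequired(
           List.of("r1", "r1", "r1"), 3));               // prints 1
       System.out.println(additionalReplicaRequired(
           List.of("r1", "r1", "r1", "r1"), 3));         // prints 1
     }
   }
   ```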
   
   I can give these methods new names or improve the Javadocs a bit more to make it clear that placement != replication and that the number returned from the ContainerPlacementStatus object does not reflect the replica count. I've made a small change to the docs now and will push it up:
   
   ```
     /**
      * Returns a boolean indicating if the container replicas meet the desired
      * placement policy. That is, they are placed on a sufficient number of
      * racks, or node groups etc. This does not check if the container is over
      * or under replicated, as it is possible for a container to have enough
      * replicas and still not meet the placement rules.
      * @return True if the container replicas meet the policy. False otherwise.
      */
     boolean isPolicySatisfied();
   
     /**
      * Returns a String describing why a container does not meet the placement
      * policy.
      * @return String indicating the reason the policy is not satisfied, or
      *         null if the policy is satisfied.
      */
     String misReplicatedReason();
   
     /**
      * If the container does not meet the placement policy, return an integer
      * indicating how many additional replicas are required so the container
      * meets the placement policy. Otherwise return zero.
      * Note the count returned is the number of replicas needed to meet the
      * placement policy. The container may need additional replicas if it is
      * under replicated. The container could also have sufficient replicas but
      * require more to make it meet the policy, as the existing replicas are
      * not placed correctly.
      * @return The number of additional replicas required, or zero.
      */
     int additionalReplicaRequired();
   ```
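
   For reference, a hedged sketch of how a caller such as the ReplicationManager (or Recon) might consume these methods; `LOG`, `id` and the other variables are assumed to be in scope as in the diff above:

   ```
   final ContainerPlacementStatus status =
       containerPlacement.validateContainerPlacement(
           targetReplicas, replicationFactor);
   if (!status.isPolicySatisfied()) {
     // Mis-replicated: a placement problem, independent of replica count.
     LOG.info("Container {} is mis-replicated: {}. {} additional replica(s) "
         + "needed to satisfy the placement policy.", id,
         status.misReplicatedReason(), status.additionalReplicaRequired());
   }
   ```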
   
   I could change the name of additionalReplicaRequired() to something like replicasRequiredForPlacement(), if you think that makes it clearer.
     



