sodonnel commented on code in PR #3962:
URL: https://github.com/apache/ozone/pull/3962#discussion_r1022701104


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/health/ECReplicationCheckHandler.java:
##########
@@ -95,6 +96,24 @@ public boolean handle(ContainerCheckRequest request) {
   public ContainerHealthResult checkHealth(ContainerCheckRequest request) {
     ContainerInfo container = request.getContainerInfo();
     Set<ContainerReplica> replicas = request.getContainerReplicas();
+
+    /*
+    Remove UNHEALTHY replicas because they are unavailable. They could be a
+    reason for under replication but should not be a reason for over
+    replication.
+
+    For example, consider the following set of replicas for an EC 3-2 
container:
+    Replica Index 1: Closed
+    Replica Index 2: Closed
+    Replica Index 3: Closed, Unhealthy (2 replicas for this index)
+    Replica Index 4: Unhealthy
+    Replica Index 5: Closed
+
+    This is a case of under replication because index 4 is unavailable. Index
+    3 is not considered over replicated because its second copy is unhealthy.
+     */
+    replicas.removeIf(containerReplica -> containerReplica.getState() ==

Review Comment:
   I think there is a very subtle bug here. The replicas set comes from the 
ContainerCheckRequest object, and from it, you are removing the unhealthy 
replicas. Later, that same request will end up in in the unhealthy handler and 
it will find there are no unhealthy replicas, as we removed them here.
   
   I guess we need to create a copy of the set to modify it, but its too easy 
to fall into this trap and the way our tests are currently structured, testing 
each handler in isolation, we would never catch it.
   
   I think we need to also update the ContainerCheckRequest object to wrap all 
the sets in `Collections.UnmodifableSet(...)` and then this operation would get 
an error.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to