sodonnel commented on code in PR #5651:
URL: https://github.com/apache/ozone/pull/5651#discussion_r1406275350


##########
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/fsck/ContainerHealthStatus.java:
##########
@@ -48,8 +48,12 @@ public class ContainerHealthStatus {
     int repFactor = container.getReplicationConfig().getRequiredNodes();
     this.healthyReplicas = healthyReplicas
         .stream()
-        .filter(r -> !r.getState()
-            .equals((ContainerReplicaProto.State.UNHEALTHY)))
+        // Filter unhealthy replicas and
+        // replicas belonging to out-of-service nodes.
+        .filter(r ->
+            (!r.getDatanodeDetails().isDecommissioned() &&
+             !r.getDatanodeDetails().isMaintenance() &&

Review Comment:
   It is "OK" for a maintenance replica to be offline. By definition, 
maintenance allows one or two of a container's 3 replicas to be offline while 
the container is still considered healthy, so I am not sure it is correct to 
simply assume that a maintenance copy is offline. If all 3 replicas of a 
container were in maintenance, with no other available copies, then that 
container would be under-replicated.
   
   This leads to another question - what if a container has 3 replicas from 
alive nodes:
   
   IN_SERVICE
   IN_SERVICE
   IN_MAINTENANCE
   
   Then the IN_MAINTENANCE node is shut down, so only the 2 IN_SERVICE nodes 
are left. This is not considered under-replicated, and the Replication Manager 
(RM) will take no action on this container, because it expects the 
IN_MAINTENANCE copy to return.
   
   SCM keeps track of replicas on maintenance nodes and counts them as still 
being available, so that it can make decisions in cases like this.
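   
   To illustrate the rule above, here is a minimal sketch of how maintenance 
replicas might be counted. It is not the actual SCM or RM code: the class, the 
`NodeState` enum, `missingReplicas`, and the `minHealthyForMaintenance` 
parameter are hypothetical names, and the threshold is an assumption chosen to 
match "one or two replicas out of 3 can be offline".

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch only; names and threshold are assumptions, not Ozone APIs.
class MaintenanceReplicaSketch {

  // Simplified stand-in for the operational state of the node holding a replica.
  enum NodeState { IN_SERVICE, IN_MAINTENANCE }

  /**
   * Returns how many additional replicas would be needed.
   * Maintenance replicas are assumed to return, so they offset the shortfall,
   * but at least minHealthyForMaintenance in-service copies must remain.
   */
  static int missingReplicas(List<NodeState> replicas, int repFactor,
      int minHealthyForMaintenance) {
    int inService = (int) replicas.stream()
        .filter(s -> s == NodeState.IN_SERVICE).count();
    int maintenance = (int) replicas.stream()
        .filter(s -> s == NodeState.IN_MAINTENANCE).count();

    int delta = repFactor - inService;
    if (delta <= 0) {
      return 0; // Enough in-service replicas already.
    }
    if (maintenance == 0) {
      return delta; // No maintenance copies to offset the shortfall.
    }
    // Maintenance copies cover part of the shortfall...
    int afterMaintenance = Math.max(0, delta - maintenance);
    // ...but a minimum number of in-service copies is still required.
    int neededHealthy = Math.max(0, minHealthyForMaintenance - inService);
    return Math.max(afterMaintenance, neededHealthy);
  }

  public static void main(String[] args) {
    List<NodeState> mixed = Arrays.asList(
        NodeState.IN_SERVICE, NodeState.IN_SERVICE, NodeState.IN_MAINTENANCE);
    List<NodeState> filtered = Arrays.asList(
        NodeState.IN_SERVICE, NodeState.IN_SERVICE);
    List<NodeState> allMaintenance = Arrays.asList(
        NodeState.IN_MAINTENANCE, NodeState.IN_MAINTENANCE,
        NodeState.IN_MAINTENANCE);

    System.out.println(missingReplicas(mixed, 3, 2));
    System.out.println(missingReplicas(filtered, 3, 2));
    System.out.println(missingReplicas(allMaintenance, 3, 2));
  }
}
```

   With an assumed threshold of 2, the IN_SERVICE, IN_SERVICE, IN_MAINTENANCE 
case reports 0 missing replicas, while the same container with the maintenance 
replica filtered out reports 1. That is the kind of count mismatch a filter on 
maintenance replicas could introduce.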
   
   My concern is that Recon would then show a different under-replicated count 
than the RM report, which can confuse users.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

