xBis7 commented on code in PR #5651:
URL: https://github.com/apache/ozone/pull/5651#discussion_r1406876671


##########
hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/fsck/ContainerHealthStatus.java:
##########
@@ -48,8 +48,12 @@ public class ContainerHealthStatus {
     int repFactor = container.getReplicationConfig().getRequiredNodes();
     this.healthyReplicas = healthyReplicas
         .stream()
-        .filter(r -> !r.getState()
-            .equals((ContainerReplicaProto.State.UNHEALTHY)))
+        // Filter unhealthy replicas and
+        // replicas belonging to out-of-service nodes.
+        .filter(r ->
+            (!r.getDatanodeDetails().isDecommissioned() &&
+             !r.getDatanodeDetails().isMaintenance() &&

Review Comment:
   As far as I understand, a node doesn’t go offline until its replicas have 
been copied to another node. While the node is ENTERING_MAINTENANCE or 
DECOMMISSIONING, container replicas are added or removed as needed to maintain 
proper replication. The container stays under-replicated until the copies have 
been made and the node has successfully gone offline.
   
   Once that is done, the container is correctly replicated: it has 3 healthy, 
available replicas and 1 offline replica. SCM doesn’t report any 
under-replicated or over-replicated containers, but Recon:
   
   - on master, counts 1 over-replicated container because it sees 4 replicas 
(it makes no distinction between online and offline replicas).
   - with this patch, counts 0.
   
   When the offline datanode is stopped, SCM still doesn’t report any unhealthy 
containers, and Recon:
   
   - on master, no longer counts 1 over-replicated container.
   - with this patch, shows no change.
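
   The two counting behaviours above can be sketched roughly as follows. This 
is only an illustration under my own assumptions: `ReplicaCountSketch`, 
`OpState`, `Replica` and the helper methods are hypothetical stand-ins, not the 
Recon or SCM classes, and the real `ContainerHealthStatus` logic may differ.

```java
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT the Ozone classes.
class ReplicaCountSketch {

  // Simplified operational states of the datanode that holds a replica.
  enum OpState { IN_SERVICE, DECOMMISSIONED, IN_MAINTENANCE }

  static final class Replica {
    final String node;
    final OpState opState;
    final boolean unhealthy;

    Replica(String node, OpState opState, boolean unhealthy) {
      this.node = node;
      this.opState = opState;
      this.unhealthy = unhealthy;
    }
  }

  // Master-like counting as described above: only UNHEALTHY replicas are dropped.
  static long countIgnoringOpState(List<Replica> replicas) {
    return replicas.stream().filter(r -> !r.unhealthy).count();
  }

  // Patch-like counting: replicas on out-of-service nodes are dropped as well.
  static long countInServiceOnly(List<Replica> replicas) {
    return replicas.stream()
        .filter(r -> !r.unhealthy && r.opState == OpState.IN_SERVICE)
        .count();
  }

  // Positive: over-replicated; negative: under-replicated; zero: as required.
  static long replicaDelta(long count, int repFactor) {
    return count - repFactor;
  }

  public static void main(String[] args) {
    int repFactor = 3;
    // 3 replicas on in-service nodes plus 1 on a decommissioned (offline) node.
    List<Replica> replicas = List.of(
        new Replica("dn1", OpState.IN_SERVICE, false),
        new Replica("dn2", OpState.IN_SERVICE, false),
        new Replica("dn3", OpState.IN_SERVICE, false),
        new Replica("dn4", OpState.DECOMMISSIONED, false));

    System.out.println("ignoring op state: "
        + replicaDelta(countIgnoringOpState(replicas), repFactor));
    System.out.println("in-service only:   "
        + replicaDelta(countInServiceOnly(replicas), repFactor));
  }
}
```

   In this sketch, stopping the offline datanode corresponds to dropping `dn4` 
from the list: the first count then falls from 4 to 3, while the second stays 
at 3 throughout.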
   
   When there are 3 healthy replicas and 3 nodes, and 1 of the nodes prepares 
to go offline, the container is temporarily under-replicated until a replica 
copy is created. Once that is done, the container is considered properly 
replicated and the node goes offline. There should no longer be an issue of 
under-replication.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

