sodonnel commented on code in PR #5726:
URL: https://github.com/apache/ozone/pull/5726#discussion_r1430672703


##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/SCMCommonPlacementPolicy.java:
##########
@@ -445,6 +445,7 @@ public ContainerPlacementStatus validateContainerPlacement(
       }
     }
     List<Integer> currentRackCount = new ArrayList<>(dns.stream()
+        .filter(d -> !(d.isDecommissioned()))

Review Comment:
   The new replication manager does not report on mis-replication when a 
container is over or under-replicated. It expects those conditions to be fixed 
first.
   
   We have to consider what mis-replication really means. The original 
intention was to indicate that the rack tolerance of the container isn't high 
enough. If you have enough racks, but too many nodes on some racks due to 
over-replication, then the container isn't really mis-replicated in my opinion.
   
   I think one reason the max-per-rack limit was brought in was for EC + Rack 
Scatter. Say you have 3 racks and EC 3-2. The ideal distribution is 2, 2, 1. 
However, 3, 1, 1 is also possible, but it's not good: if you lose the rack 
with 3, the EC data is no longer readable. In that case, the container would 
rightly be reported as mis-replicated.
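
   As a rough illustration of the EC 3-2 case above, here is a minimal sketch. 
The class `RackScatterSketch` and its methods are hypothetical and are not 
part of Ozone; the ceiling formula for the per-rack limit is an assumption 
made for the example, not taken from the placement policy code.

```java
// Hypothetical helper, not Ozone code: shows why 3,1,1 is worse than 2,2,1
// for EC 3-2 spread over 3 racks.
final class RackScatterSketch {

  // Assumed per-rack limit: ceil(totalReplicas / racks).
  static int maxPerRack(int totalReplicas, int racks) {
    return (totalReplicas + racks - 1) / racks;
  }

  // True if at least dataCount replicas remain after losing any single rack,
  // which is what EC needs to keep the data readable.
  static boolean survivesAnyRackLoss(int[] perRack, int dataCount) {
    int total = 0;
    for (int count : perRack) {
      total += count;
    }
    for (int count : perRack) {
      if (total - count < dataCount) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    int data = 3;
    int parity = 2;
    int racks = 3;
    System.out.println(maxPerRack(data + parity, racks));               // 2
    System.out.println(survivesAnyRackLoss(new int[] {2, 2, 1}, data)); // true
    System.out.println(survivesAnyRackLoss(new int[] {3, 1, 1}, data)); // false
  }
}
```

   With 2, 2, 1 any single rack loss leaves at least 3 replicas; with 3, 1, 1 
losing the first rack leaves only 2, which is below the 3 data blocks needed.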
   
   When you have over-replication, I think it is fine to report the container 
as over-replicated, and so long as the minimum number of racks is met, it is 
not mis-replicated. After the over-replication has been fixed, mis-replication 
can come into play.
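
   The ordering I have in mind could be sketched roughly like this. All names 
here (`ReplicationOrderSketch`, `Health`, `classify`) are hypothetical and 
do not match the real replication manager classes, and the sketch only looks 
at the number of racks used, not the per-rack limit.

```java
// Hypothetical sketch, not Ozone code: replica-count problems are reported
// first, and mis-replication is only considered once the count is correct.
final class ReplicationOrderSketch {

  enum Health { UNDER_REPLICATED, OVER_REPLICATED, MIS_REPLICATED, HEALTHY }

  // perRack[i] is the number of replicas on rack i.
  static Health classify(int[] perRack, int expectedReplicas,
      int requiredRacks) {
    int total = 0;
    int racksUsed = 0;
    for (int count : perRack) {
      total += count;
      if (count > 0) {
        racksUsed++;
      }
    }
    if (total < expectedReplicas) {
      return Health.UNDER_REPLICATED;
    }
    if (total > expectedReplicas) {
      // Extra replicas are handled first, whatever the rack layout is.
      return Health.OVER_REPLICATED;
    }
    if (racksUsed < requiredRacks) {
      return Health.MIS_REPLICATED;
    }
    return Health.HEALTHY;
  }

  public static void main(String[] args) {
    // 3 expected replicas that should span at least 2 racks.
    System.out.println(classify(new int[] {2, 2}, 3, 2)); // OVER_REPLICATED
    System.out.println(classify(new int[] {3, 0}, 3, 2)); // MIS_REPLICATED
    System.out.println(classify(new int[] {2, 1}, 3, 2)); // HEALTHY
  }
}
```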



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

