sodonnel commented on code in PR #5726:
URL: https://github.com/apache/ozone/pull/5726#discussion_r1430672703
##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/SCMCommonPlacementPolicy.java:
##########
@@ -445,6 +445,7 @@ public ContainerPlacementStatus validateContainerPlacement(
}
}
List<Integer> currentRackCount = new ArrayList<>(dns.stream()
+ .filter(d -> !(d.isDecommissioned()))
Review Comment:
The new replication manager does not report on mis-replication when a
container is over or under-replicated. It expects those conditions to be fixed
first.
We have to consider what mis-replication really means. The original
intention was to indicate that the rack tolerance of the container isn't high
enough. If you have enough racks, but too many nodes on some racks due to
over-replication, then the container isn't really mis-replicated in my opinion.
I think that one reason the max-per-rack was brought in was for EC + Rack
Scatter. Say you have 3 racks and EC 3-2. The ideal distribution is 2, 2, 1.
However 3, 1, 1 is also possible, but it's not good: EC 3-2 can only tolerate
losing two replicas, so if you lose the rack with 3, the EC data is no longer
readable. In that case, the container would rightly be reported as
mis-replicated.
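To make the EC 3-2 arithmetic concrete, here is a minimal sketch. It is not Ozone code; `RackLossSketch` and `survivesAnyRackLoss` are hypothetical names, and the only assumption is that an EC container stays readable as long as it loses no more replicas than it has parity blocks.

```java
import java.util.Arrays;

public class RackLossSketch {
    // Hypothetical helper: true if losing any single rack still leaves the
    // container readable, i.e. no rack holds more replicas than the parity count.
    static boolean survivesAnyRackLoss(int[] replicasPerRack, int parity) {
        int maxOnOneRack = Arrays.stream(replicasPerRack).max().orElse(0);
        return maxOnOneRack <= parity;
    }

    public static void main(String[] args) {
        int parity = 2; // EC 3-2: 3 data + 2 parity replicas
        System.out.println(survivesAnyRackLoss(new int[] {2, 2, 1}, parity)); // true
        System.out.println(survivesAnyRackLoss(new int[] {3, 1, 1}, parity)); // false
    }
}
```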
When you have over-replication, I think it is fine to report the container
as over-replicated, and so long as the minimum number of racks is met, it is
not mis-replicated. After over-replication has been fixed, mis-replication can
come into play.
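The ordering I have in mind could be sketched roughly like this. This is an illustration only, not the actual `validateContainerPlacement` logic; the class, method and parameter names are made up, and the max-per-rack value is assumed to be the expected replica count divided by the required racks, rounded up.

```java
public class PlacementOrderSketch {
    // Hypothetical check: the minimum rack count always applies, but the
    // max-per-rack limit is only enforced once the replica count is correct.
    static boolean isMisReplicated(int[] replicasPerRack, int expectedReplicas,
                                   int requiredRacks) {
        int total = 0;
        int usedRacks = 0;
        int maxOnOneRack = 0;
        for (int count : replicasPerRack) {
            total += count;
            if (count > 0) {
                usedRacks++;
            }
            maxOnOneRack = Math.max(maxOnOneRack, count);
        }
        if (usedRacks < requiredRacks) {
            return true; // not enough racks: mis-replicated regardless
        }
        if (total > expectedReplicas) {
            return false; // over-replicated: fix that first, racks are sufficient
        }
        // Assumed limit: ceil(expectedReplicas / requiredRacks).
        int maxPerRack = (expectedReplicas + requiredRacks - 1) / requiredRacks;
        return maxOnOneRack > maxPerRack;
    }

    public static void main(String[] args) {
        // EC 3-2 (5 replicas) over 3 racks.
        System.out.println(isMisReplicated(new int[] {3, 2, 1}, 5, 3)); // false
        System.out.println(isMisReplicated(new int[] {3, 1, 1}, 5, 3)); // true
        System.out.println(isMisReplicated(new int[] {2, 2, 1}, 5, 3)); // false
    }
}
```

With 3, 2, 1 the container has six replicas on three racks, so it is only over-replicated; once the extra replica is removed, the max-per-rack check decides whether what remains is mis-replicated.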
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]