siddhantsangwan commented on code in PR #3962:
URL: https://github.com/apache/ozone/pull/3962#discussion_r1023238442
##########
hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/health/ECReplicationCheckHandler.java:
##########
@@ -95,6 +96,24 @@ public boolean handle(ContainerCheckRequest request) {
public ContainerHealthResult checkHealth(ContainerCheckRequest request) {
ContainerInfo container = request.getContainerInfo();
Set<ContainerReplica> replicas = request.getContainerReplicas();
+
+ /*
+ Remove UNHEALTHY replicas because they are unavailable. They could be a
+ reason for under replication but should not be a reason for over
+ replication.
+
+ For example, consider the following set of replicas for an EC 3-2
container:
+ Replica Index 1: Closed
+ Replica Index 2: Closed
+ Replica Index 3: Closed, Unhealthy (2 replicas for this index)
+ Replica Index 4: Unhealthy
+ Replica Index 5: Closed
+
+ This is a case of under replication because index 4 is unavailable. Index
+ 3 is not considered over replicated because its second copy is unhealthy.
+ */
+ replicas.removeIf(containerReplica -> containerReplica.getState() ==
Review Comment:
Great catch. I've removed this code and instead added logic to consider
unhealthy replicas in `ECContainerReplicaCount`. Now it only exists in that one
place.
> I think we need to also update the ContainerCheckRequest object to wrap
all the sets in Collections.UnmodifableSet(...) and then this operation would
get an error.
Done
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]