siddhantsangwan opened a new pull request, #4083:
URL: https://github.com/apache/ozone/pull/4083

   ## What changes were proposed in this pull request?
   
   Scenario:
   In a cluster of 13 datanodes with a bucket using EC replication 
RS-3-2-1024K, a key's container is in the CLOSED state and 2 of its replicas 
are made UNHEALTHY.
   ```
   Replica Index 1: Closed
   Replica Index 2: Unhealthy
   Replica Index 3: Closed
   Replica Index 4: Unhealthy
   Replica Index 5: Closed
   ```
   
   Expected behaviour - The container should be identified as under replicated 
due to the unhealthy replicas. Once it is fully replicated, the unhealthy 
replicas should be removed.
   Observed behaviour - The UNHEALTHY replicas are still present after 
12 hours, and the under replication is neither identified nor fixed.
   The reason is that the container is handled in 
`ClosedWithMismatchedReplicasHandler`, because not all replica states match 
the container state (CLOSED with 2 UNHEALTHY). However, we should not consider 
UNHEALTHY in `ClosedWithMismatchedReplicasHandler`, because UNHEALTHY is a 
special state. The handler is intended for a CLOSED container with OPEN or 
CLOSING replicas.
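
To make the expected behaviour concrete, here is a minimal sketch of why the container in the scenario counts as under replicated. It is not Ozone code: the class `EcUnderReplicationSketch`, the method `missingIndexes`, and the simplified `State` enum are hypothetical names used only for illustration. The sketch assumes that an index without a CLOSED replica needs reconstruction, and that RS-3-2 can rebuild at most 2 missing indexes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of the Ozone code base.
class EcUnderReplicationSketch {
    // Simplified replica states, limited to the two that appear in the scenario.
    enum State { CLOSED, UNHEALTHY }

    // Returns the 1-based replica indexes that have no CLOSED replica.
    static List<Integer> missingIndexes(List<State> replicasByIndex) {
        List<Integer> missing = new ArrayList<>();
        for (int i = 0; i < replicasByIndex.size(); i++) {
            if (replicasByIndex.get(i) != State.CLOSED) {
                missing.add(i + 1);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Replica indexes 1..5 from the scenario above.
        List<State> replicas = Arrays.asList(
            State.CLOSED, State.UNHEALTHY, State.CLOSED,
            State.UNHEALTHY, State.CLOSED);
        int parity = 2; // RS-3-2: 3 data + 2 parity

        List<Integer> missing = missingIndexes(replicas);
        System.out.println("indexes needing reconstruction: " + missing);
        System.out.println("recoverable: " + (missing.size() <= parity));
    }
}
```

In this model, indexes 2 and 4 need reconstruction, and the 3 remaining CLOSED replicas are enough to rebuild them.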
   
   There are two changes proposed here:
   1. `ClosedWithMismatchedReplicasHandler` sends close commands only for 
CLOSING or OPEN replicas.
   2. The handler always returns false so that other handlers in the chain can 
fix issues such as under replication. Consider a scenario:
   
   CLOSED EC 3-2 container with 5 replicas: 
   ```
   CLOSED, CLOSED, CLOSED, UNHEALTHY, CLOSING
   ```
   This container is under replicated. Currently, 
`ClosedWithMismatchedReplicasHandler` sends a close command for the CLOSING 
replica and returns true, so no later handler in the chain runs. With this 
change, the handler still sends the close command but returns false. The under 
replication handler can then fix the under replication by reconstructing from 
the 3 CLOSED replicas.
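
The two proposed changes can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in this pull request: `MismatchedReplicasSketch`, `Result`, and `handle` are hypothetical names, and the real handler works on Ozone's own container and replica types. The sketch only shows the decision logic: close commands go to OPEN or CLOSING replicas, UNHEALTHY replicas are ignored, and the result is never marked as handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the handler from this pull request.
class MismatchedReplicasSketch {
    // Replica states mentioned in the description.
    enum State { OPEN, CLOSING, CLOSED, UNHEALTHY }

    // Hypothetical result type: which replicas get a close command, and
    // whether the chain should stop (true) or continue (false).
    static final class Result {
        final List<Integer> closeCommands;
        final boolean handled;

        Result(List<Integer> closeCommands, boolean handled) {
            this.closeCommands = closeCommands;
            this.handled = handled;
        }
    }

    static Result handle(State containerState, List<State> replicas) {
        List<Integer> closeCommands = new ArrayList<>();
        if (containerState == State.CLOSED) {
            for (int i = 0; i < replicas.size(); i++) {
                State s = replicas.get(i);
                // Change 1: only OPEN or CLOSING replicas are closed.
                if (s == State.OPEN || s == State.CLOSING) {
                    closeCommands.add(i + 1); // 1-based position in the list
                }
            }
        }
        // Change 2: always return false so later handlers still run.
        return new Result(closeCommands, false);
    }

    public static void main(String[] args) {
        Result r = handle(State.CLOSED, Arrays.asList(
            State.CLOSED, State.CLOSED, State.CLOSED,
            State.UNHEALTHY, State.CLOSING));
        System.out.println("close commands for replicas: " + r.closeCommands);
        System.out.println("handled: " + r.handled);
    }
}
```

For the scenario above, only the fifth replica (CLOSING) receives a close command, the UNHEALTHY replica is left alone, and the unhandled result lets an under replication handler run next.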
   
   ## What is the link to the Apache JIRA
   
   https://issues.apache.org/jira/browse/HDDS-7640
   
   ## How was this patch tested?
   
   Added tests


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

