siddhantsangwan opened a new pull request, #5794: URL: https://github.com/apache/ozone/pull/5794
## What changes were proposed in this pull request?

A `QUASI_CLOSED` container may have some `UNHEALTHY` replicas with the same sequence id as the container, while no healthy replica has the correct sequence id. Such `UNHEALTHY` replicas cannot be deleted and must be kept around. If the DN hosting such an `UNHEALTHY` replica is put into decommission, decommission stays blocked because the `UNHEALTHY` replica cannot be lost, yet RM currently does nothing about it. This PR makes RM replicate these vulnerable `UNHEALTHY` replicas so that decommission can succeed.

Changes introduced:

1. A new handler, `VulnerableUnhealthyReplicasHandler`, leverages the existing `replicaCount.getVulnerableUnhealthyReplicas` API to find such `UNHEALTHY` replicas. If any are found, the container is marked as under replicated and added to the under replication queue.
2. The under replicated container is then handled in `RatisUnderReplicationHandler`. It tries to find a new target DN for each `UNHEALTHY` replica and sends replicate commands. The logic is similar to what we have already done for legacy RM. Some additional changes were required to correctly determine the used and excluded nodes to pass into the placement policy API for finding target DNs.
3. Changes to the decommission monitor so that both RMs use the `replicaSet.isHealthyEnoughForOffline` API.

The third point above basically solves [ReplicationManager: Unhealthy replicas of a sufficiently replicated container can block decommissioning](https://issues.apache.org/jira/browse/HDDS-9383). If required, the third change can be split off into its own PR since this one is quite large.

Need to add some more tests to `TestRatisUnderReplicationHandler`.

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-9592

## How was this patch tested?

New tests.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
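
For readers following the thread, the detection step described in the pull request summary above can be sketched roughly as follows. This is a minimal, self-contained illustration of one reading of that description, not the code in the PR: the class `VulnerableReplicaSketch`, the `Replica` type, both enums, and the method `findVulnerableUnhealthy` are all hypothetical stand-ins, and the real `getVulnerableUnhealthyReplicas` API may apply different or additional conditions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class VulnerableReplicaSketch {

  // Hypothetical, simplified stand-ins for SCM types; not the real Ozone classes.
  enum ReplicaState { QUASI_CLOSED, UNHEALTHY }
  enum NodeState { IN_SERVICE, DECOMMISSIONING }

  static final class Replica {
    final String datanode;
    final ReplicaState state;
    final long sequenceId;
    final NodeState nodeState;

    Replica(String datanode, ReplicaState state, long sequenceId, NodeState nodeState) {
      this.datanode = datanode;
      this.state = state;
      this.sequenceId = sequenceId;
      this.nodeState = nodeState;
    }
  }

  /**
   * Assumed rule, taken from the PR description: an UNHEALTHY replica is
   * vulnerable when it carries the container's sequence id, no healthy
   * replica carries that sequence id, and its datanode is leaving service.
   * The caller is assumed to have checked that the container is QUASI_CLOSED.
   */
  static List<Replica> findVulnerableUnhealthy(long containerSequenceId, List<Replica> replicas) {
    // A healthy replica with the correct sequence id means nothing is at risk.
    for (Replica r : replicas) {
      if (r.state != ReplicaState.UNHEALTHY && r.sequenceId == containerSequenceId) {
        return new ArrayList<>();
      }
    }
    List<Replica> vulnerable = new ArrayList<>();
    for (Replica r : replicas) {
      if (r.state == ReplicaState.UNHEALTHY
          && r.sequenceId == containerSequenceId
          && r.nodeState != NodeState.IN_SERVICE) {
        vulnerable.add(r);
      }
    }
    return vulnerable;
  }

  public static void main(String[] args) {
    List<Replica> replicas = Arrays.asList(
        new Replica("dn1", ReplicaState.UNHEALTHY, 10, NodeState.DECOMMISSIONING),
        new Replica("dn2", ReplicaState.QUASI_CLOSED, 8, NodeState.IN_SERVICE),
        new Replica("dn3", ReplicaState.QUASI_CLOSED, 8, NodeState.IN_SERVICE));
    // Each vulnerable replica would need a copy on a new target DN.
    for (Replica r : findVulnerableUnhealthy(10, replicas)) {
      System.out.println("replicate from " + r.datanode + " before decommission");
    }
  }
}
```

In this sketch only the replica on `dn1` is reported, because the two healthy replicas are behind the container's sequence id. A non-empty result corresponds to the case where the container would be marked under replicated and queued.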
