sodonnel opened a new pull request, #4118: URL: https://github.com/apache/ozone/pull/4118
## What changes were proposed in this pull request?

In EC, a container is considered missing and under-replicated if it has lost so many replicas that offline reconstruction is not possible. If any of the remaining replicas of such a container are on a datanode that is being decommissioned, the decommissioning will not proceed. All the containers on that node must be restored to proper replication before it can finish decommissioning, but the code will not copy the replica of the missing container to a different node.

There are 3 parts to fixing this problem:

1. In DatanodeAdminMonitorImpl, inside the method checkContainersReplicatedOnNode, we call ECContainerReplicaCount.isSufficientlyReplicated() to decide whether the container is adequately replicated. Even if the other two parts are addressed, this is still a problem, because the container is unrecoverable and so this check can never pass. For EC containers in the decommission monitor, perhaps we need a different check: that the replica on the host being checked is also available on another IN_SERVICE host. From a decommission point of view, we don't care whether the entire EC container is sufficiently replicated; we only care that the replica on the current host has a copy elsewhere.

2. In ECReplicationCheckHandler, we deliberately skip adding "unrecoverable" containers to the under-replicated queue, because we previously believed there was no point in adding them: they cannot be recovered anyway. However, this decommission issue is specific to EC, so we should allow the container onto the under-replicated queue if it has decommissioning or maintenance indexes.

3. In ECUnderReplicationHandler, we need to check that the decommissioning indexes are copied successfully, even if the container is otherwise unrecoverable.

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-7666

## How was this patch tested?

New unit tests added.
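For illustration only, the alternative decommission check described in the first part above (each replica index on the decommissioning host must also exist on another IN_SERVICE host) might be sketched as follows. This is not the code from the patch: the `Replica` class, the `OpState` enum, and the `hasInServiceCopy` method are hypothetical, simplified stand-ins and do not mirror the real Ozone classes.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; names and types are assumptions, not Ozone's API.
final class DecommissionCheckSketch {

  /** Simplified stand-in for a datanode's operational state. */
  enum OpState { IN_SERVICE, DECOMMISSIONING, ENTERING_MAINTENANCE }

  /** Simplified view of one EC replica: its index, its host, and the host's state. */
  static final class Replica {
    final int index;
    final String host;
    final OpState state;

    Replica(int index, String host, OpState state) {
      this.index = index;
      this.host = host;
      this.state = state;
    }
  }

  /**
   * Returns true if every replica index held on decomHost also exists on a
   * different host that is IN_SERVICE. It deliberately ignores whether the
   * container as a whole is sufficiently replicated.
   */
  static boolean hasInServiceCopy(List<Replica> replicas, String decomHost) {
    for (Replica onHost : replicas) {
      if (!onHost.host.equals(decomHost)) {
        continue; // only indexes stored on the decommissioning host matter
      }
      boolean copied = false;
      for (Replica other : replicas) {
        if (other.index == onHost.index
            && !other.host.equals(decomHost)
            && other.state == OpState.IN_SERVICE) {
          copied = true;
          break;
        }
      }
      if (!copied) {
        return false; // this index would be lost if the host went away
      }
    }
    return true;
  }

  public static void main(String[] args) {
    // An unrecoverable container with only indexes 1 and 2 left;
    // index 1 lives on the decommissioning node dn1.
    List<Replica> before = Arrays.asList(
        new Replica(1, "dn1", OpState.DECOMMISSIONING),
        new Replica(2, "dn2", OpState.IN_SERVICE));
    System.out.println(hasInServiceCopy(before, "dn1")); // false

    // After index 1 has been copied to the in-service node dn3.
    List<Replica> after = Arrays.asList(
        new Replica(1, "dn1", OpState.DECOMMISSIONING),
        new Replica(2, "dn2", OpState.IN_SERVICE),
        new Replica(1, "dn3", OpState.IN_SERVICE));
    System.out.println(hasInServiceCopy(after, "dn1")); // true
  }
}
```

In this sketch the container stays unrecoverable in both cases, yet the second case would let decommissioning of dn1 proceed, which is the behaviour the description argues for.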
