[
https://issues.apache.org/jira/browse/HDDS-9321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Siddhant Sangwan updated HDDS-9321:
-----------------------------------
Priority: Critical (was: Major)
> LegacyReplicationManager: Unhealthy replicas of a sufficiently replicated
> container can block decommissioning
> -------------------------------------------------------------------------------------------------------------
>
> Key: HDDS-9321
> URL: https://issues.apache.org/jira/browse/HDDS-9321
> Project: Apache Ozone
> Issue Type: Sub-task
> Components: SCM
> Reporter: Siddhant Sangwan
> Assignee: Siddhant Sangwan
> Priority: Critical
>
> A mix of QUASI_CLOSED and UNHEALTHY replicas blocks decommissioning even
> when the container is sufficiently replicated.
> a. This happens when only some of the replicas hit an error during a write.
> b. It can be fixed by removing this check:
> {code}
> if (!replicaSet.isHealthy()) {
>   if (LOG.isDebugEnabled()) {
>     unhealthyIDs.add(cid);
>   }
>   if (unhealthy < CONTAINER_DETAILS_LOGGING_LIMIT
> {code}
> However, simply removing that check is not a complete solution. We also need
> to try to preserve any UNHEALTHY replicas that have the greatest Sequence ID.
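
One possible reading of "preserve any UNHEALTHY replicas that have the greatest Sequence ID" is sketched below. This is only an illustration under stated assumptions, not the LegacyReplicationManager code: the class UnhealthyReplicaSketch, the Replica and State types, and the method unhealthyToPreserve are all hypothetical names, and the rule used (keep an UNHEALTHY replica only if its sequence ID equals the maximum across all replicas of the container) is an assumption about the intent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; these are NOT the actual Ozone SCM classes. */
final class UnhealthyReplicaSketch {

  /** Simplified stand-in for replica states (assumption). */
  enum State { CLOSED, QUASI_CLOSED, UNHEALTHY }

  /** Minimal stand-in for a container replica (assumption). */
  static final class Replica {
    final String datanode;
    final State state;
    final long sequenceId;

    Replica(String datanode, State state, long sequenceId) {
      this.datanode = datanode;
      this.state = state;
      this.sequenceId = sequenceId;
    }
  }

  /**
   * Returns the UNHEALTHY replicas whose sequence ID equals the greatest
   * sequence ID seen across all replicas of the container. Under the
   * assumed rule, these are the replicas that should not be deleted.
   */
  static List<Replica> unhealthyToPreserve(List<Replica> replicas) {
    long maxSeq = Long.MIN_VALUE;
    for (Replica r : replicas) {
      maxSeq = Math.max(maxSeq, r.sequenceId);
    }
    List<Replica> keep = new ArrayList<>();
    for (Replica r : replicas) {
      if (r.state == State.UNHEALTHY && r.sequenceId == maxSeq) {
        keep.add(r);
      }
    }
    return keep;
  }

  public static void main(String[] args) {
    List<Replica> replicas = Arrays.asList(
        new Replica("dn1", State.QUASI_CLOSED, 10L),
        new Replica("dn2", State.QUASI_CLOSED, 10L),
        new Replica("dn3", State.UNHEALTHY, 12L),
        new Replica("dn4", State.UNHEALTHY, 8L));
    for (Replica r : unhealthyToPreserve(replicas)) {
      System.out.println(r.datanode); // prints dn3
    }
  }
}
```

In this example the UNHEALTHY replica on dn3 holds the most recent data (sequence ID 12), so it is kept, while the stale UNHEALTHY replica on dn4 is not. How the real fix weighs such replicas against replication and decommissioning requirements is not stated in the issue.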