[
https://issues.apache.org/jira/browse/HDDS-9254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Stephen O'Donnell resolved HDDS-9254.
-------------------------------------
Fix Version/s: 1.4.0
Resolution: Fixed
> Legacy replication manager uses mismatched replicas as replication sources
> --------------------------------------------------------------------------
>
> Key: HDDS-9254
> URL: https://issues.apache.org/jira/browse/HDDS-9254
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Ethan Rose
> Assignee: Ethan Rose
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.4.0
>
>
> Consider a case where SCM has a CLOSED container whose replica states are
> CLOSED, CLOSED, QUASI-CLOSED. In the pull replication model, RM sends all 3
> of these replicas to the datanode as candidate replication sources. The DN
> does a random shuffle and picks one to replicate from, so it has a 1 in 3
> chance of choosing the QUASI-CLOSED replica. If it does, the next iteration
> of RM sees replicas CLOSED, CLOSED, QUASI-CLOSED, QUASI-CLOSED. RM issues
> the same command because the CLOSED replicas are still under-replicated,
> but now the odds of the DN's random shuffle choosing a QUASI-CLOSED replica
> have risen to 2 in 4. This process can repeat until every datanode in the
> cluster holds a QUASI-CLOSED replica, which can bring the cluster into the
> stuck state described in HDDS-8536.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)