[
https://issues.apache.org/jira/browse/HDDS-13846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ashish Kumar resolved HDDS-13846.
---------------------------------
Fix Version/s: 2.2.0
Resolution: Fixed
> Container CLOSED with sequence ID lower than a replica
> ------------------------------------------------------
>
> Key: HDDS-13846
> URL: https://issues.apache.org/jira/browse/HDDS-13846
> Project: Apache Ozone
> Issue Type: Bug
> Reporter: Ashish Kumar
> Assignee: Ashish Kumar
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.2.0
>
>
> In the case below, SCM closes the container while the container's sequence ID
> is lower than a replica's sequence ID.
> {code:java}
> ERROR
> [node1-FixedThreadPoolWithAffinityExecutor-9-0]-org.apache.hadoop.hdds.scm.container.ContainerReportHandler:
> Container CLOSED with sequence ID lower than a replica: Container #2006
> (CLOSED, sid=5408) r0 (CLOSED, bcsid=6556,
> origin=858ea3bc-f045-442f-a34e-6a3e3c6e38f4, non-empty) from dn
> 858ea3bc-f045-442f-a34e-6a3e3c6e38f4(xxxxxx), proto=containerID: 2006 state:
> CLOSED used: 4364173816 keyCount: 26 readCount: 411 writeCount: 1215
> readBytes: 1723858944 writeBytes: 4988468861 deleteTransactionId: 7006
> blockCommitSequenceId: 6304 originNodeId:
> "858ea3bc-f045-442f-a34e-6a3e3c6e38f4" replicaIndex: 0 isEmpty: false
> dataChecksum: 121871427{code}
> h4. Assume the scenario below:
> h4. Initial State:
> * Leader SCM: {{containerInfo(state=CLOSING, sequenceId=6556)}} (in memory)
> * Follower SCM: {{containerInfo(state=CLOSING, sequenceId=5408)}} (in memory)
> h4. When CLOSE Event Triggers on Leader:
> *Step 1 - Leader Processing:*
> {code:java}
> // Leader SCM:
> final ContainerInfo oldInfo = containers.getContainerInfo(id);
> // sequenceId=6556
> containers.updateState(id, CLOSING, CLOSED); // State: CLOSING → CLOSED
> transactionBuffer.addToBuffer(containerStore, id,
> containers.getContainerInfo(id));
> // Persists: state=CLOSED, sequenceId=6556{code}
> *Step 2 - Raft Replication:*
> {code:java}
> // Raft log entry contains: CLOSE event for containerID
> // NO sequenceId information in the Raft log!{code}
> *Step 3 - Follower Processing:*
> {code:java}
> // Follower SCM applies the same transaction:
> final ContainerInfo oldInfo = containers.getContainerInfo(id);
> // sequenceId=5408
> containers.updateState(id, CLOSING, CLOSED); // State: CLOSING → CLOSED
> transactionBuffer.addToBuffer(containerStore, id,
> containers.getContainerInfo(id));
> // Persists: state=CLOSED, sequenceId=5408 (STALE){code}
> After the CLOSE operation:
> * Leader DB: {{state=CLOSED, sequenceId=6556}}
> * Follower DB: {{state=CLOSED, sequenceId=5408}}
> This has become more visible since a log message was recently added in HDDS-12409.
> Eventually the follower's sequence ID is also [updated in memory|https://github.com/apache/ozone/blob/d5be9866648bf65c76e176250d9b34cbac331c84/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/AbstractContainerReportHandler.java#L142] (in the DB it remains old).
> This follows from the current code logic: on the follower the sequence ID
> update is eventual, while the state is consistent across all 3 SCMs.
>
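The scenario above can be reproduced with a small standalone model. This is only a sketch under stated assumptions: `CloseDivergenceSketch`, `Info`, `Scm`, `applyClose`, `onReplicaReport` and `closedBehindReplica` are hypothetical names invented for illustration, not Ozone classes or methods, and the in-memory update rule (take the replica's value only when it is higher) is an assumption about the linked handler code. It shows how replaying a CLOSE event that carries no sequence ID persists each SCM's local value.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the scenario; none of these types are Ozone classes.
class CloseDivergenceSketch {

  enum State { CLOSING, CLOSED }

  // Stand-in for ContainerInfo, reduced to the two fields that matter here.
  static final class Info {
    final State state;
    final long sequenceId;

    Info(State state, long sequenceId) {
      this.state = state;
      this.sequenceId = sequenceId;
    }

    @Override
    public String toString() {
      return "state=" + state + ", sequenceId=" + sequenceId;
    }
  }

  // Stand-in for one SCM: an in-memory view plus a persisted ("DB") view.
  static final class Scm {
    final Map<Long, Info> memory = new HashMap<>();
    final Map<Long, Info> db = new HashMap<>();

    Scm(long containerId, long sequenceId) {
      memory.put(containerId, new Info(State.CLOSING, sequenceId));
    }

    // Applies the replicated CLOSE event. The event carries only the container ID,
    // so the persisted sequence ID is whatever this SCM currently holds in memory.
    void applyClose(long containerId) {
      Info old = memory.get(containerId);
      Info closed = new Info(State.CLOSED, old.sequenceId);
      memory.put(containerId, closed);
      db.put(containerId, closed);
    }

    // Models a later container report (assumed rule): memory catches up, the DB does not.
    void onReplicaReport(long containerId, long replicaBcsId) {
      Info current = memory.get(containerId);
      if (replicaBcsId > current.sequenceId) {
        memory.put(containerId, new Info(current.state, replicaBcsId));
      }
    }
  }

  // The condition behind the reported log line: a CLOSED container behind a replica.
  static boolean closedBehindReplica(Info container, long replicaBcsId) {
    return container.state == State.CLOSED && container.sequenceId < replicaBcsId;
  }

  public static void main(String[] args) {
    long id = 2006L;
    Scm leader = new Scm(id, 6556L);
    Scm follower = new Scm(id, 5408L);

    // The same CLOSE transaction is applied on both SCMs.
    leader.applyClose(id);
    follower.applyClose(id);
    System.out.println("Leader DB:   " + leader.db.get(id));
    System.out.println("Follower DB: " + follower.db.get(id));
    System.out.println("Follower behind replica: "
        + closedBehindReplica(follower.memory.get(id), 6556L));

    // A replica report later fixes the follower's memory only.
    follower.onReplicaReport(id, 6556L);
    System.out.println("Follower memory after report: " + follower.memory.get(id));
    System.out.println("Follower DB after report:     " + follower.db.get(id));
  }
}
```

In this model both SCMs agree on the state after the CLOSE, but the follower persists its stale sequence ID, and a later report updates only its in-memory copy. Carrying the sequence ID in the replicated event would be one way to avoid the divergence in the model; whether that matches the actual fix is not stated in this message.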