[ 
https://issues.apache.org/jira/browse/HDDS-13846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashish Kumar resolved HDDS-13846.
---------------------------------
    Fix Version/s: 2.2.0
       Resolution: Fixed

> Container CLOSED with sequence ID lower than a replica
> ------------------------------------------------------
>
>                 Key: HDDS-13846
>                 URL: https://issues.apache.org/jira/browse/HDDS-13846
>             Project: Apache Ozone
>          Issue Type: Bug
>            Reporter: Ashish Kumar
>            Assignee: Ashish Kumar
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.2.0
>
>
> In the case below, SCM closes a container while the container's sequence ID 
> is lower than a replica's sequence ID.
> {code:java}
> ERROR 
> [node1-FixedThreadPoolWithAffinityExecutor-9-0]-org.apache.hadoop.hdds.scm.container.ContainerReportHandler:
>  Container CLOSED with sequence ID lower than a replica: Container #2006 
> (CLOSED, sid=5408) r0 (CLOSED, bcsid=6556, 
> origin=858ea3bc-f045-442f-a34e-6a3e3c6e38f4, non-empty) from dn 
> 858ea3bc-f045-442f-a34e-6a3e3c6e38f4(xxxxxx), proto=containerID: 2006 state: 
> CLOSED used: 4364173816 keyCount: 26 readCount: 411 writeCount: 1215 
> readBytes: 1723858944 writeBytes: 4988468861 deleteTransactionId: 7006 
> blockCommitSequenceId: 6304 originNodeId: 
> "858ea3bc-f045-442f-a34e-6a3e3c6e38f4" replicaIndex: 0 isEmpty: false 
> dataChecksum: 121871427{code}
> h4. Assume the scenario below:
> h4. Initial State:
>  * Leader SCM: {{containerInfo(state=CLOSING, sequenceId=6556)}} (in memory)
>  * Follower SCM: {{containerInfo(state=CLOSING, sequenceId=5408)}} (in memory)
> h4. When CLOSE Event Triggers on Leader:
> *Step 1 - Leader Processing:*
> {code:java}
> // Leader SCM: 
> final ContainerInfo oldInfo = containers.getContainerInfo(id); 
> // sequenceId=6556 
> containers.updateState(id, CLOSING, CLOSED); // State: CLOSING → CLOSED 
> transactionBuffer.addToBuffer(containerStore, id, 
> containers.getContainerInfo(id)); 
> // Persists: state=CLOSED, sequenceId=6556{code}
> *Step 2 - Raft Replication:*
> {code:java}
> // Raft log entry contains: CLOSE event for containerID 
> // NO sequenceId information in the Raft log!{code}
> *Step 3 - Follower Processing:*
> {code:java}
> // Follower SCM applies the same transaction: 
> final ContainerInfo oldInfo = containers.getContainerInfo(id); 
> // sequenceId=5408 
> containers.updateState(id, CLOSING, CLOSED); // State: CLOSING → CLOSED 
> transactionBuffer.addToBuffer(containerStore, id, 
> containers.getContainerInfo(id)); 
> // Persists: state=CLOSED, sequenceId=5408 (STALE){code}
> After the CLOSE operation:
>  * Leader DB: {{state=CLOSED, sequenceId=6556}}
>  * Follower DB: {{state=CLOSED, sequenceId=5408}}
> The problem is more visible now because the log message was added recently in HDDS-12409.
> Eventually the follower's sequence ID is also [updated in 
> memory|https://github.com/apache/ozone/blob/d5be9866648bf65c76e176250d9b34cbac331c84/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/AbstractContainerReportHandler.java#L142] 
> (in the DB it remains stale).
> This reflects the current code logic: on a follower the sequence ID update is 
> eventual, but the container state is consistent across all 3 SCMs.
>  
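The divergence described in the three steps above can be reproduced with a small standalone model. This is only a sketch under stated assumptions: `CloseReplayDemo`, `Scm`, and `Info` are hypothetical names, not Ozone classes, and two plain maps stand in for the in-memory container state and the container store. The only behavior taken from the issue is that the replicated CLOSE entry carries the container ID but no sequence ID, so each SCM persists whatever sequence ID it holds locally.

```java
import java.util.HashMap;
import java.util.Map;

public class CloseReplayDemo {
    enum State { CLOSING, CLOSED }

    /** Simplified stand-in for ContainerInfo: only state and sequence ID. */
    static final class Info {
        final State state;
        final long sequenceId;

        Info(State state, long sequenceId) {
            this.state = state;
            this.sequenceId = sequenceId;
        }
    }

    /** Hypothetical model of one SCM: an in-memory map plus a "DB" map. */
    static final class Scm {
        final Map<Long, Info> memory = new HashMap<>();
        final Map<Long, Info> db = new HashMap<>();

        Scm(long containerId, long sequenceId) {
            memory.put(containerId, new Info(State.CLOSING, sequenceId));
        }

        /** Applies a replicated CLOSE entry, which carries only the container ID. */
        void applyClose(long containerId) {
            Info old = memory.get(containerId);
            // The sequence ID comes from local memory, not from the log entry.
            Info closed = new Info(State.CLOSED, old.sequenceId);
            memory.put(containerId, closed);
            db.put(containerId, closed);
        }

        /** True when the persisted record is CLOSED with a sequence ID below the replica's. */
        boolean closedBelowReplica(long containerId, long replicaBcsId) {
            Info info = db.get(containerId);
            return info.state == State.CLOSED && info.sequenceId < replicaBcsId;
        }
    }

    public static void main(String[] args) {
        long id = 2006L;
        Scm leader = new Scm(id, 6556L);
        Scm follower = new Scm(id, 5408L);

        // Both SCMs apply the same CLOSE transaction.
        leader.applyClose(id);
        follower.applyClose(id);

        System.out.println("leader DB sequenceId=" + leader.db.get(id).sequenceId
            + ", below replica=" + leader.closedBelowReplica(id, 6556L));
        System.out.println("follower DB sequenceId=" + follower.db.get(id).sequenceId
            + ", below replica=" + follower.closedBelowReplica(id, 6556L));
    }
}
```

In this model both SCMs end in state CLOSED, but only the follower's persisted record trips the "sequence ID lower than a replica" condition, which matches the leader and follower DB values listed in the description.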


