[jira] [Created] (HDDS-12409) Log an error before increasing the sequence id of a CLOSED container in SCM

Siddhant Sangwan (Jira) Mon, 24 Feb 2025 22:46:09 -0800

Siddhant Sangwan created HDDS-12409:
---------------------------------------


             Summary: Log an error before increasing the sequence id of a 
CLOSED container in SCM
                 Key: HDDS-12409
                 URL: https://issues.apache.org/jira/browse/HDDS-12409
             Project: Apache Ozone
          Issue Type: Bug
          Components: SCM
            Reporter: Siddhant Sangwan


There have been situations where a replica reports a higher BCSID than SCM 
knows for a container that is already CLOSED. Ideally this should not happen, 
but it can because of bugs in the Datanode-side applyTransaction and Ratis 
group removal paths.

Currently, when handling a container report in SCM:
{code}
    if (isHealthy(replicaProto::getState)) {
      if (containerInfo.getSequenceId() <
          replicaProto.getBlockCommitSequenceId()) {
        containerInfo.updateSequenceId(
            replicaProto.getBlockCommitSequenceId());
      }

{code}
we check whether the replica is healthy and whether the container's sequence 
id is lower than the replica's BCSID. If so, we update the sequence id:
{code}
  public void updateSequenceId(long sequenceID) {
    assert (isOpen() || state == HddsProtos.LifeCycleState.QUASI_CLOSED);
    sequenceId = max(sequenceID, sequenceId);
  }
{code}

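There's an assert statement there because we don't expect to update a CLOSED 
container's sequence id. However, Java assertions are disabled by default at 
runtime, so unless the SCM JVM is started with -enableassertions (-ea), the 
assert is skipped and the update goes through silently.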

I propose to log an error message here to make this situation visible in the 
logs. We need further discussion on whether updating the sequence id of a 
CLOSED container should be allowed at all by default. Should we crash the SCM 
and allow the update only once an admin has reviewed the situation and 
explicitly set a configuration that permits it? This jira is restricted to 
logging; a separate jira should be created to change the default behaviour.
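
As a rough illustration of the proposed logging (a sketch only, not the actual 
patch): ContainerInfoSketch below is a hypothetical, simplified stand-in for 
SCM's ContainerInfo, its LifeCycleState enum only mirrors the names in 
HddsProtos.LifeCycleState, and java.util.logging is used to keep the example 
self-contained, whereas the real class would use the project's own logger. The 
original assert is left out so that the sketch behaves the same with or 
without -ea.

```java
import java.util.logging.Logger;

// Hypothetical, simplified stand-in for SCM's ContainerInfo. Only the parts
// needed to illustrate the proposed logging are modelled here.
final class ContainerInfoSketch {

  // Subset of states, named after the HddsProtos.LifeCycleState values.
  enum LifeCycleState { OPEN, CLOSING, QUASI_CLOSED, CLOSED }

  // Assumption: java.util.logging stands in for the project's real logger.
  private static final Logger LOG =
      Logger.getLogger(ContainerInfoSketch.class.getName());

  private final long containerId;
  private final LifeCycleState state;
  private long sequenceId;

  ContainerInfoSketch(long containerId, LifeCycleState state,
      long sequenceId) {
    this.containerId = containerId;
    this.state = state;
    this.sequenceId = sequenceId;
  }

  long getSequenceId() {
    return sequenceId;
  }

  void updateSequenceId(long newSequenceId) {
    if (state == LifeCycleState.CLOSED && newSequenceId > sequenceId) {
      // Unexpected situation: log it so it is visible even when the JVM
      // runs with assertions disabled.
      LOG.severe(String.format(
          "Container %d is CLOSED but its sequence id is being increased "
              + "from %d to %d",
          containerId, sequenceId, newSequenceId));
    }
    // Behaviour is unchanged: the sequence id only ever moves forward.
    sequenceId = Math.max(newSequenceId, sequenceId);
  }
}
```

With this shape, a CLOSED container that receives a higher BCSID still has its 
sequence id updated, as it does today, but the event leaves an error in the SCM 
log. Whether the update should be rejected instead is the open question for the 
follow-up jira.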



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

