[
https://issues.apache.org/jira/browse/HDDS-12409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ASF GitHub Bot updated HDDS-12409:
----------------------------------
Labels: pull-request-available (was: )
> Log an error before increasing the sequence id of a CLOSED container in SCM
> ---------------------------------------------------------------------------
>
> Key: HDDS-12409
> URL: https://issues.apache.org/jira/browse/HDDS-12409
> Project: Apache Ozone
> Issue Type: Bug
> Components: SCM
> Reporter: Siddhant Sangwan
> Assignee: Peter Lee
> Priority: Major
> Labels: pull-request-available
>
> There have been situations where a replica reports a higher BCSID (block
> commit sequence id) than SCM knows for a container that is already CLOSED.
> Ideally this should not happen, but it can because of bugs in the
> Datanode-side applyTransaction and Ratis group removal paths.
> Currently, when handling a container report in SCM:
> {code}
> if (isHealthy(replicaProto::getState)) {
>   if (containerInfo.getSequenceId() <
>       replicaProto.getBlockCommitSequenceId()) {
>     containerInfo.updateSequenceId(
>         replicaProto.getBlockCommitSequenceId());
>   }
> {code}
> we check whether the replica is healthy and whether the container's sequence
> id is lower than the replica's BCSID. If so, we update the sequence id:
> {code}
> public void updateSequenceId(long sequenceID) {
>   assert (isOpen() || state == HddsProtos.LifeCycleState.QUASI_CLOSED);
>   sequenceId = max(sequenceID, sequenceId);
> }
> {code}
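> There's an assert statement there because we don't expect to update a CLOSED
> container's sequence id, but Java assertions are disabled by default at
> runtime. Unless the JVM is started with -enableassertions (-ea), the assert
> is not evaluated and the update goes through silently.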
> I propose to log an error message here to make this situation visible in the
> logs. We need further discussion on whether updating the sequence id of a
> CLOSED container should be allowed at all by default. Should we crash the
> SCM and allow the update only once an admin has reviewed the situation and
> explicitly set a configuration that permits it? This jira is restricted to
> logging; a separate jira should be created to change the default behaviour.
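
For illustration only, the proposed logging could look roughly like the sketch below. It is not the actual patch: ContainerInfoSketch, its reduced LifeCycleState enum, isSequenceIdUpdateExpected() and the use of java.util.logging are all assumptions made to keep the example self-contained. The point it shows is that an explicit check logs the unexpected update whether or not the JVM runs with assertions enabled, while the update itself still happens as it does today.

```java
import java.util.logging.Logger;

/** Hypothetical, simplified stand-in for SCM's ContainerInfo (not the real Ozone class). */
class ContainerInfoSketch {
  /** Reduced set of lifecycle states, enough for this illustration. */
  enum LifeCycleState { OPEN, QUASI_CLOSED, CLOSED }

  // Assumption: java.util.logging is used only to keep the sketch dependency-free.
  private static final Logger LOG =
      Logger.getLogger(ContainerInfoSketch.class.getName());

  private final long containerId;
  private final LifeCycleState state;
  private long sequenceId;

  ContainerInfoSketch(long containerId, LifeCycleState state, long sequenceId) {
    this.containerId = containerId;
    this.state = state;
    this.sequenceId = sequenceId;
  }

  long getSequenceId() {
    return sequenceId;
  }

  /** True for the states in which a sequence id update is expected. */
  boolean isSequenceIdUpdateExpected() {
    return state == LifeCycleState.OPEN || state == LifeCycleState.QUASI_CLOSED;
  }

  void updateSequenceId(long newSequenceId) {
    // Unlike an assert, this check runs whether or not assertions are enabled.
    if (!isSequenceIdUpdateExpected() && newSequenceId > sequenceId) {
      LOG.severe(String.format(
          "Container %d is in state %s but its sequence id is being increased from %d to %d",
          containerId, state, sequenceId, newSequenceId));
    }
    // Behaviour is otherwise unchanged: the higher sequence id is still applied.
    sequenceId = Math.max(newSequenceId, sequenceId);
  }
}
```

With this shape, a CLOSED container that receives a higher BCSID produces one error line and still ends up with the higher sequence id, which matches the "logging only" scope of this jira.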