[ https://issues.apache.org/jira/browse/HDDS-12305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17933188#comment-17933188 ]
Sumit Agrawal commented on HDDS-12305:
--------------------------------------
There are multiple error scenarios in which a replica can have BCSID=0 at the
time SCM treats the container as empty and triggers its deletion. So, from the
facts available at SCM at the point of deletion, it is not possible to reliably
avoid deleting a container that only appears to be empty.
To handle this scenario, the following alternative approach is proposed:
# If all reported replicas are empty, mark the container state at SCM as
DELETED (no replica holds data) *(at the moment of the check; existing behavior)*
# If some replicas are later reported with data while the SCM state is
DELETED / DELETING, move the container state back from DELETED / DELETING to
CLOSED/OPEN/QUASI_CLOSED, matching the state reported by the replica.
# Avoid sending *FORCE_DELETE* for empty-container deletion or for container
deletion in general (it should be used only for replica deletion in the
over-replication case).
https://issues.apache.org/jira/browse/HDDS-12421 will implement the above
solution: move the container back to the QUASI_CLOSED/CLOSED state and do not
perform a force delete.
Impact: This may lead to orphan containers but will avoid any data loss.
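
The three steps above can be sketched roughly as follows. This is only an illustrative model, not the actual SCM code: the types State, Replica, DeleteReason and ContainerStateSketch and their methods are hypothetical names made up for this example, any additional preconditions the real SCM may apply before deleting an empty container are omitted, and picking the first non-empty replica's state is a simplification.

```java
import java.util.List;

// Hypothetical, simplified model of the proposal; not the real Ozone classes.
enum State { OPEN, QUASI_CLOSED, CLOSED, DELETING, DELETED }

// Why a delete command is being sent to a datanode (assumed names).
enum DeleteReason { EMPTY_CONTAINER, OVER_REPLICATION }

final class Replica {
    final State state;    // state reported by the datanode
    final boolean empty;  // true if the replica reports no data

    Replica(State state, boolean empty) {
        this.state = state;
        this.empty = empty;
    }
}

final class ContainerStateSketch {

    // Returns the state SCM should hold after looking at the reported replicas.
    static State reconcile(State scmState, List<Replica> reported) {
        boolean deletedAtScm =
            scmState == State.DELETING || scmState == State.DELETED;

        if (deletedAtScm) {
            // Step 2: a replica with data showed up after the container was
            // marked DELETING/DELETED, so move back to the reported state.
            for (Replica r : reported) {
                if (!r.empty) {
                    return r.state;
                }
            }
            return scmState;
        }

        // Step 1 (existing behavior): every reported replica is empty at the
        // moment of the check, so the container is marked DELETED.
        if (!reported.isEmpty() && reported.stream().allMatch(r -> r.empty)) {
            return State.DELETED;
        }
        return scmState;
    }

    // Step 3: force delete is reserved for removing excess replicas only.
    static boolean useForceDelete(DeleteReason reason) {
        return reason == DeleteReason.OVER_REPLICATION;
    }
}
```

With this model, a container that was marked DELETED and later receives a report of a non-empty QUASI_CLOSED replica is moved back to QUASI_CLOSED, and because the empty-container delete is never forced, a datanode holding data is not told to drop it unconditionally.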
> Avoid marking container as DELETED if it was due to EMPTY with BCSID=0
> containers
> ---------------------------------------------------------------------------------
>
> Key: HDDS-12305
> URL: https://issues.apache.org/jira/browse/HDDS-12305
> Project: Apache Ozone
> Issue Type: Sub-task
> Reporter: Uma Maheswara Rao G
> Assignee: Sumit Agrawal
> Priority: Major
>
> BCSID=0 means there might have been errors while applying transactions.
> Containers are created only when the first chunk is written. By default,
> BCSID is 0. As applyTransactions are played, the BCSID keeps getting
> incremented. A container reported with BCSID=0 means that either the node
> failed right after creating the container or it failed to apply the
> transaction.
> It may be a safe check to have SCM not mark the whole container as DELETED
> based on EMPTY BCSID=0 container reports from datanodes. Instead, we can
> delete only that specific replica.
> In one of the offline discussions, @Tsz Sze brought up this point. To avoid
> losing track of it, I am logging the Jira here. Let's discuss whether there
> are other cases to take care of.
> [~sodonnell] [~siddhant] [~sumitagrawal] [~swamirishi]