[ https://issues.apache.org/jira/browse/HDDS-13928?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ivan Andika updated HDDS-13928:
-------------------------------
Description:
Thought of one case that might cause orphan blocks
1. A CLOSED Container 1 contains replicas in [DN1, DN2, DN3]
2. A delete transaction is created for Container 1 blocks, but not yet sent
3. DN1 is marked as DEAD and SCM removes its container replica
4. SCM replicates the Container 1 to DN4
5. Delete commands are sent to [DN2, DN3, DN4]
6. DN2, DN3, DN4 finish the deletion and acknowledge it to SCM
7. SCM removes the delete transaction
8. DN1 comes back alive (resurrected)
9. The over-replicated replica on DN4 is removed, which restores the
original 3 replicas from step 1.
Notice that since the delete transaction has already been removed, the
undeleted blocks on DN1 are orphaned and will never be deleted.
We need a way to handle this case.
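The sequence above can be replayed with a small toy model. This is only an illustrative sketch, not Ozone code: the `Scm` class, its fields, the `send_delete` helper and the block IDs `b1`..`b3` are all hypothetical names made up for this example. It assumes the delete transaction is dropped as soon as every replica SCM currently knows about has acknowledged.

```python
from dataclasses import dataclass, field


@dataclass
class Scm:
    """Toy stand-in for SCM state (hypothetical, not the real Ozone classes)."""
    replicas: dict = field(default_factory=dict)        # datanode -> block IDs it holds
    known: set = field(default_factory=set)             # datanodes SCM counts as replica holders
    pending_delete: set = field(default_factory=set)    # blocks in the open delete transaction

    def send_delete(self):
        # Delete commands reach only the replicas SCM currently knows about.
        for dn in self.known:
            self.replicas[dn] -= self.pending_delete
        # Every known replica acknowledged, so the transaction is dropped.
        self.pending_delete = set()


def run_scenario():
    blocks = {"b1", "b2", "b3"}
    scm = Scm(
        replicas={dn: set(blocks) for dn in ("DN1", "DN2", "DN3")},
        known={"DN1", "DN2", "DN3"},
    )                                                   # step 1
    scm.pending_delete = {"b1", "b2"}                   # step 2: created, not yet sent
    scm.known.discard("DN1")                            # step 3: DN1 marked DEAD
    scm.replicas["DN4"] = set(scm.replicas["DN2"])      # step 4: replicate to DN4
    scm.known.add("DN4")
    scm.send_delete()                                   # steps 5-7
    scm.known.add("DN1")                                # step 8: DN1 resurrected
    scm.known.discard("DN4")                            # step 9: drop the extra replica
    del scm.replicas["DN4"]
    return scm


scm = run_scenario()
# Blocks still on DN1 that a healthy replica no longer has.
orphans = scm.replicas["DN1"] - scm.replicas["DN2"]
print(sorted(orphans))
```

In this model DN1 still holds `b1` and `b2` after step 9 while `pending_delete` is empty, so nothing is left that would ever trigger their deletion.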
was:
Thought of one case that might cause orphan blocks
1. A CLOSED Container 1 contains replicas in [DN1, DN2, DN3]
2. A delete transaction is created for Container 1 blocks, but not yet sent
3. DN1 is marked as DEAD and SCM removes its container replica
4. SCM replicates the Container 1 to DN4
5. Delete commands are sent to [DN2, DN3, DN4]
6. DN2, DN3, DN4 finish the deletion and acknowledge it to SCM
7. SCM removes the delete transaction
8. DN1 comes back alive
9. The overreplicated replica DN4 is removed
Notice that since the deletion transaction has been removed, the undeleted
blocks in DN1 will be orphaned and will never be deleted.
We need a way to handle this case.
> Cleanup orphan blocks on datanodes that are resurrected
> -------------------------------------------------------
>
> Key: HDDS-13928
> URL: https://issues.apache.org/jira/browse/HDDS-13928
> Project: Apache Ozone
> Issue Type: Sub-task
> Reporter: Ivan Andika
> Priority: Major