[jira] [Commented] (HDDS-7728) Block should be safely deleted from the containers if they are instructed from OM and containers are in missing state.

Stephen O'Donnell (Jira) Thu, 21 Sep 2023 08:15:08 -0700


    [ 
https://issues.apache.org/jira/browse/HDDS-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17767605#comment-17767605
 ]


Stephen O'Donnell commented on HDDS-7728:
-----------------------------------------

I suspect there are various other scenarios where orphan blocks can appear in 
datanode containers: when writes fail, when a client gets killed, or, with EC, 
when the final stripe of a block is bad.

If there are other such scenarios, then having RM delete the replica with the 
largest delete transaction ID does not solve the problem completely, and 
another Recon-based solution is needed anyway.

It is also not as simple to add this logic in RM as you may think, as RM needs 
to balance other things:

1. For replication, the source is currently picked as the least loaded DN, not 
a random one. This is integral to the throttling design.
2. For replica delete, we need to consider the placement policy. Then we may 
want to consider removing the replica on a DN with the least free space.
3. Over-replication handling also kicks in around the balancer, where it 
decides which replica to copy and then which to remove, so you need to consider 
that too.
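
To illustrate how these criteria can pull against each other, here is a rough 
sketch. It is not RM code: the Replica fields, the "at least two racks" 
placement rule and the tie-break order are all assumptions made up for the 
example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    # Hypothetical view of one container replica; not an Ozone class.
    datanode: str
    rack: str
    free_space_gb: int
    delete_tx_id: int


def placement_ok(replicas, min_racks=2):
    # Simplified stand-in for a placement policy: require a minimum rack spread.
    return len({r.rack for r in replicas}) >= min(min_racks, len(replicas))


def pick_replica_to_remove(replicas, min_racks=2):
    # Only consider replicas whose removal keeps the placement rule satisfied.
    candidates = [
        r for r in replicas
        if placement_ok([o for o in replicas if o is not r], min_racks)
    ]
    if not candidates:
        return None
    # Assumed tie-break order: most stale delete transaction first,
    # then the datanode with the least free space.
    return min(candidates, key=lambda r: (r.delete_tx_id, r.free_space_gb))
```

With replicas on dn1, dn2, dn3 (rack a, delete transaction 10) and dn4 (rack b, 
delete transaction 7), the stale replica on dn4 is the only one on rack b, so 
the placement rule excludes it and a healthy replica is removed instead. Each 
extra rule creates cases like this that then need their own handling.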

Then a natural extension of the problem is whether to check the delete 
transaction of normally replicated containers to see if it is behind in some 
of the replicas and, if it is, treat the container as under-replicated and 
start the process of making new copies and removing the bad one.
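
As a sketch of what that check might look like (the function names and the 
plain dict of datanode to delete transaction ID are hypothetical, not how RM 
tracks replicas):

```python
def lagging_replicas(delete_tx_by_datanode):
    # Return the datanodes whose delete transaction ID is behind the newest
    # one seen across all replicas of the container.
    if not delete_tx_by_datanode:
        return []
    newest = max(delete_tx_by_datanode.values())
    return sorted(dn for dn, tx in delete_tx_by_datanode.items() if tx < newest)


def effective_replica_count(delete_tx_by_datanode):
    # Count only replicas that are up to date; lagging ones are treated as
    # if they were missing, which would make the container under-replicated.
    return len(delete_tx_by_datanode) - len(lagging_replicas(delete_tx_by_datanode))
```

For example, with replicas at transaction IDs 42, 42 and 37, the third replica 
is reported as lagging and the effective count drops from 3 to 2, so a 
container that looks healthy today would start triggering replication work.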

All these rules and complexity add up. If this were to completely solve the 
problem for all orphan block scenarios then it might be a good idea, but I am 
not convinced it does. If there are other ways orphan blocks can creep in, then 
we need another solution anyway. If that is the case, then we are better off 
not adding all these rules to RM and implementing the overall solution in a 
single place.

> Block should be safely deleted from the containers if they are instructed 
> from OM and containers are in missing state.
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDDS-7728
>                 URL: https://issues.apache.org/jira/browse/HDDS-7728
>             Project: Apache Ozone
>          Issue Type: Improvement
>          Components: SCM
>    Affects Versions: 1.3.0
>            Reporter: Uma Maheswara Rao G
>            Assignee: Ashish Kumar
>            Priority: Major
>
> Currently, when OM instructs that blocks be deleted and the containers are 
> in a missing state, the deletion may not be processed properly. This Jira 
> tracks this requirement and implements safe deletion of blocks whatever 
> state they are in. Otherwise, containers would never get cleaned up even 
> though all the blocks in those files have been deleted.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
