[ 
https://issues.apache.org/jira/browse/HDFS-11922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037537#comment-16037537
 ] 

Anu Engineer commented on HDFS-11922:
-------------------------------------

[~cheersyang] Thanks for your comment. You are absolutely right, it is SCM that 
needs to do the actual delete. However, KSM is the first to learn that a key is 
being deleted, because the key delete operation is performed in KSM. Once that 
is done, the actual block deletion has to be done by calling into SCM. Since 
this problem straddles both KSM and SCM, I just tagged it as KSM. As you pointed 
out, the majority of the work is in SCM; my reasoning was simply that the work 
originates in KSM. Please feel free to change the tag to SCM if you think it is 
more appropriate.
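
To make the division of work above concrete, here is a minimal sketch of the 
flow: KSM removes the key metadata first and then calls into SCM, which owns the 
actual block deletion. The class and method names (KsmSketch, ScmSketch, 
deleteBlocks, pendingCount) are hypothetical stand-ins for illustration and are 
not the real Ozone APIs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical stand-in for SCM: it owns the actual block deletion.
class ScmSketch {
    // Blocks waiting to be removed from the Datanodes.
    private final Queue<String> pendingDeletes = new ArrayDeque<>();

    // Called by KSM after the key metadata has been removed.
    void deleteBlocks(List<String> blockIds) {
        pendingDeletes.addAll(blockIds);
    }

    int pendingCount() {
        return pendingDeletes.size();
    }
}

// Hypothetical stand-in for KSM: it is the first to see a key delete.
class KsmSketch {
    private final Map<String, List<String>> keyToBlocks = new HashMap<>();
    private final ScmSketch scm;

    KsmSketch(ScmSketch scm) {
        this.scm = scm;
    }

    void putKey(String key, List<String> blockIds) {
        keyToBlocks.put(key, new ArrayList<>(blockIds));
    }

    boolean hasKey(String key) {
        return keyToBlocks.containsKey(key);
    }

    // Step 1: drop the metadata in KSM. Step 2: hand the blocks to SCM.
    // Returns false if the key did not exist.
    boolean deleteKey(String key) {
        List<String> blocks = keyToBlocks.remove(key);
        if (blocks == null) {
            return false;
        }
        scm.deleteBlocks(blocks);
        return true;
    }
}
```

In this sketch the delete originates in KsmSketch.deleteKey, but the only 
component that ever holds the list of blocks to remove is ScmSketch, which is 
why the issue could reasonably carry either tag.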


> Ozone: KSM: Garbage collect deleted blocks
> ------------------------------------------
>
>                 Key: HDFS-11922
>                 URL: https://issues.apache.org/jira/browse/HDFS-11922
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ozone
>            Reporter: Anu Engineer
>            Priority: Critical
>
> We need to garbage collect deleted blocks from the Datanodes. There are two 
> cases where we will have orphaned blocks. One is like classical HDFS, 
> where someone deletes a key and we need to delete the corresponding blocks.
> The other case is when someone overwrites a key -- an overwrite can be treated 
> as a delete and a new put -- which means that the older blocks need to be 
> GC-ed at some point. 
> A couple of JIRAs have discussed this in one form or another, so I am 
> consolidating all those discussions in this JIRA. 
> HDFS-11796 -- needs this issue fixed for some tests to pass. 
> HDFS-11780 -- changed the old overwrite behavior so that overwriting is not 
> supported for the time being.
> HDFS-11920 -- once again runs into this issue when a user tries to put an 
> existing key.
> HDFS-11781 -- the delete key API in KSM only deletes the metadata and relies 
> on GC for the Datanodes. 
> When we solve this issue, we should also consider two more aspects. 
> One, we support versioning in the buckets, and tracking which blocks are 
> really orphaned is something that KSM will do. So delete and overwrite at some 
> point need to decide how to handle versioning of buckets.
> Two, if a key exists in a closed container, then it is immutable; hence the 
> strategy for removing the key might be more complex than just talking to an 
> open container.
> cc : [~xyao], [~cheersyang], [~vagarychen], [~msingh], [~yuanbo], 
> [~szetszwo], [~nandakumar131]
>  
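
The two orphaning cases in the quoted description (a plain delete, and an 
overwrite treated as a delete plus a new put) can be sketched as follows. This 
is only an illustration under stated assumptions: OrphanTrackerSketch and its 
methods are hypothetical names, and bucket versioning and closed containers are 
deliberately left out because the description leaves those decisions open.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical key table that records which blocks become orphaned.
class OrphanTrackerSketch {
    private final Map<String, List<String>> keyToBlocks = new HashMap<>();
    // Blocks no longer referenced by any key; candidates for GC.
    private final List<String> orphanedBlocks = new ArrayList<>();

    // Case 2: putting an existing key acts as delete + put,
    // so the blocks of the previous value become orphaned.
    void putKey(String key, List<String> newBlocks) {
        List<String> old = keyToBlocks.put(key, new ArrayList<>(newBlocks));
        if (old != null) {
            orphanedBlocks.addAll(old);
        }
    }

    // Case 1: deleting a key orphans all of its blocks.
    void deleteKey(String key) {
        List<String> old = keyToBlocks.remove(key);
        if (old != null) {
            orphanedBlocks.addAll(old);
        }
    }

    // Returns a copy so callers cannot mutate the tracker's state.
    List<String> orphanedBlocks() {
        return new ArrayList<>(orphanedBlocks);
    }
}
```

Both paths end in the same place, a list of blocks that some later garbage 
collection pass would have to remove from the Datanodes.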



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

