[
https://issues.apache.org/jira/browse/HDFS-11922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16036631#comment-16036631
]
Weiwei Yang commented on HDFS-11922:
------------------------------------
Hi [~anu]
Thanks for filing this, it's pretty important. I noticed you added the KSM tag
to the title; do you think this work belongs in the KSM layer? I thought it
belongs in SCM, because SCM is the component that communicates with datanodes.
It seems more straightforward to let SCM scan for orphaned blocks in a
background thread and tell datanodes about them in the container report
response. The datanodes can then clean up the corresponding
containers/chunks/files. Please let me know if I missed anything.
Thanks.
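
To make the SCM-side proposal concrete, here is a minimal sketch of the flow: a
background scan compares the blocks each datanode reported against the blocks
still referenced by key metadata, and queues the difference so it can be
returned with that datanode's next container report response. Every name here
(OrphanBlockScanner, markLive, recordReport, scanOnce, pollDeleteCommands) is
hypothetical and not taken from the Ozone code base; block and datanode IDs are
plain strings for illustration only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch only; not the actual SCM implementation. */
final class OrphanBlockScanner {
    // Blocks still referenced by some key (assumed to be known to SCM).
    private final Set<String> liveBlocks = new HashSet<>();
    // Last set of blocks each datanode reported.
    private final Map<String, Set<String>> reported = new HashMap<>();
    // Orphaned blocks queued per datanode until its next report.
    private final Map<String, Set<String>> pending = new HashMap<>();

    synchronized void markLive(String blockId) {
        liveBlocks.add(blockId);
    }

    // A key delete or overwrite makes its old blocks unreferenced.
    synchronized void markUnreferenced(String blockId) {
        liveBlocks.remove(blockId);
    }

    synchronized void recordReport(String datanodeId, List<String> blockIds) {
        reported.put(datanodeId, new LinkedHashSet<>(blockIds));
    }

    /** One pass of the scan; a background thread would call this periodically. */
    synchronized void scanOnce() {
        for (Map.Entry<String, Set<String>> e : reported.entrySet()) {
            for (String blockId : e.getValue()) {
                if (!liveBlocks.contains(blockId)) {
                    pending.computeIfAbsent(e.getKey(), k -> new LinkedHashSet<>())
                           .add(blockId);
                }
            }
        }
    }

    /** Drains the queued deletions; these would ride on the report response. */
    synchronized List<String> pollDeleteCommands(String datanodeId) {
        Set<String> queued = pending.remove(datanodeId);
        return queued == null ? new ArrayList<>() : new ArrayList<>(queued);
    }
}
```

The sketch ignores versioned buckets and closed (immutable) containers, both of
which the description below calls out as open questions.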
> Ozone: KSM: Garbage collect deleted blocks
> ------------------------------------------
>
> Key: HDFS-11922
> URL: https://issues.apache.org/jira/browse/HDFS-11922
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ozone
> Reporter: Anu Engineer
> Priority: Critical
>
> We need to garbage collect deleted blocks from the Datanodes. There are two
> cases where we will have orphaned blocks. One is like the classical HDFS,
> where someone deletes a key and we need to delete the corresponding blocks.
> Another case is when someone overwrites a key -- an overwrite can be treated
> as a delete and a new put -- which means that the older blocks need to be
> GC-ed at some point in time.
> A couple of JIRAs have discussed this in one form or another, so I am
> consolidating all those discussions in this JIRA.
> HDFS-11796 -- needs to fix this issue for some tests to pass
> HDFS-11780 -- changed the old overwrite behavior to not support this
> feature for the time being.
> HDFS-11920 -- once again runs into this issue when a user tries to put an
> existing key.
> HDFS-11781 - delete key API in KSM only deletes the metadata -- and relies on
> GC for Datanodes.
> When we solve this issue, we should also consider two more aspects.
> One, we support versioning in the buckets; tracking which blocks are really
> orphaned is something that KSM will do. So delete and overwrite at some point
> need to decide how to handle versioning of buckets.
> Two, if a key exists in a closed container, then it is immutable, hence the
> strategy of removing the key might be more complex than just talking to an
> open container.
> cc : [~xyao], [~cheersyang], [~vagarychen], [~msingh], [~yuanbo],
> [~szetszwo], [~nandakumar131]
>