[ 
https://issues.apache.org/jira/browse/HDFS-12742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16224742#comment-16224742
 ] 

Weiwei Yang edited comment on HDFS-12742 at 10/30/17 11:33 AM:
---------------------------------------------------------------

Hi [~shashikant]

Sorry, my earlier comment was misleading:

bq. Note, this will only purge the deleted keys from KSM, leaving SCM and all 
actual data on DN behind.

I meant to say that this will only purge all keys from KSM; it does not remove 
any data on the DN side. The actual deletions still happen asynchronously, so 
we don't know when the data will actually be deleted. My major concern was: "I 
can't think of any reason to purge KSM immediately but not SCM and DN." Can 
you please explain the use case?

Another point: if there were an API that let an admin remove all object keys 
from KSM, it could move every key to KSM and let the background service take 
care of deleting these objects. In that case this patch would not be necessary.

If this is about an admin purge tool, then it does not need to go through the 
current object deletion code paths. Instead, it can directly access the 
KSM/SCM/container metadata to remove records, and directly remove the metadata 
files as well as the chunk files. This is like an HDFS format operation, and 
would be a more efficient way to clean up the cluster.
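
To make the difference between the two options concrete, here is a minimal 
sketch in Python. It is a toy model only: `ToyCluster` and all of its methods 
and fields are hypothetical names invented for illustration, not KSM, SCM, or 
DN APIs. It assumes a key's metadata record and its chunk data are tracked 
separately, and that a background service removes chunk data in batches.

```python
from collections import deque


class ToyCluster:
    """Toy stand-in for key metadata plus datanode chunk storage.

    Every name here is hypothetical; none of this is an Ozone API.
    """

    def __init__(self, keys):
        self.ksm_keys = set(keys)                    # live key records
        self.pending = deque()                       # keys awaiting async deletion
        self.dn_chunks = {k: b"data" for k in keys}  # chunk data on datanodes

    def mark_all_deleted(self):
        """Option 1: drop every key record now and defer the data removal."""
        self.pending.extend(sorted(self.ksm_keys))
        self.ksm_keys.clear()

    def background_pass(self, batch_size=2):
        """One run of the background service: remove one batch of chunk data."""
        for _ in range(min(batch_size, len(self.pending))):
            self.dn_chunks.pop(self.pending.popleft(), None)

    def admin_purge(self):
        """Option 2: format-like purge that bypasses the deletion path."""
        self.ksm_keys.clear()
        self.pending.clear()
        self.dn_chunks.clear()


lazy = ToyCluster(["k1", "k2", "k3"])
lazy.mark_all_deleted()
print(len(lazy.ksm_keys), len(lazy.dn_chunks))  # key records gone, data still present

direct = ToyCluster(["k1", "k2", "k3"])
direct.admin_purge()
print(len(direct.ksm_keys), len(direct.dn_chunks))  # both gone at once
```

In the first case the data disappears only after enough `background_pass` 
runs, which is the "we don't know when" concern above; in the second case 
metadata and data are removed together.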


was (Author: cheersyang):
Hi [~shashikant]

Sorry, my earlier comment was misleading:

bq. Note, this will only purge the deleted keys from KSM, leaving SCM and all 
actual data on DN behind.

I meant to say that this will only purge all keys from KSM; it does not remove 
any data on the DN side. The actual deletions still happen asynchronously, so 
we don't know when the data will actually be deleted. My major concern was: "I 
can't think of any reason to purge KSM immediately but not SCM and DN." Can 
you please explain the use case?

> Add support for KSM --expunge command
> -------------------------------------
>
>                 Key: HDFS-12742
>                 URL: https://issues.apache.org/jira/browse/HDFS-12742
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7240
>            Reporter: Shashikant Banerjee
>            Assignee: Shashikant Banerjee
>             Fix For: HDFS-7240
>
>         Attachments: HDFS-12742-HDFS-7240.001.patch, 
> HDFS-12742-HDFS-7240.002.patch
>
>
> KSM --expunge will delete all the data from the data nodes for all the keys 
> in the KSM db. 
> User will have no control over the deletion.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
