[
https://issues.apache.org/jira/browse/HDFS-12328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16162393#comment-16162393
]
Weiwei Yang commented on HDFS-12328:
------------------------------------
Hi [~yuanbo]
Thanks for updating the description. I noticed that you proposed introducing a
new SCM sub-command, "-txid". I am not in favor of this, because TXs are an
internal notion that we don't need to expose to end users. When a block cannot
be deleted after the maximum number of retries, we consider the block
*corrupted*, so from the user's point of view I think we need a *block*-level
command in SCM.
Some initial thoughts:
{code}
// list all corrupted block IDs
hdfs scm -block -list --corrupted
// get as much detail about this block as possible, including where its data
// is located, so an admin can log on to that datanode and debug why deletion failed
hdfs scm -block -info xxx
// delete a certain block
hdfs scm -block -delete xxx
// delete all corrupted blocks
// this requires an extra confirmation typed by the user
hdfs scm -block -delete --corrupted
{code}
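To make the intended semantics concrete, here is a rough Python sketch of how these block-level commands might behave. It is only an illustration under assumptions: `BlockRecord`, `BlockAdmin`, and the in-memory dict are hypothetical stand-ins, not SCM code, and the only detail taken from the issue is that a retry count of -1 marks a block whose deletion has exhausted its retries.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Per the issue description, the retry count is set to -1 once a block
# has exhausted its deletion retries. The constant name is hypothetical.
GAVE_UP = -1


@dataclass
class BlockRecord:
    """Hypothetical per-block deletion record."""
    block_id: str
    retry_count: int  # GAVE_UP (-1) means deletion was abandoned
    datanodes: List[str] = field(default_factory=list)  # where the data lives


class BlockAdmin:
    """Toy in-memory model of the proposed block-level commands."""

    def __init__(self, records: List[BlockRecord]):
        self._records: Dict[str, BlockRecord] = {r.block_id: r for r in records}

    def list_corrupted(self) -> List[str]:
        # Models: hdfs scm -block -list --corrupted
        return sorted(
            block_id
            for block_id, record in self._records.items()
            if record.retry_count == GAVE_UP
        )

    def info(self, block_id: str) -> BlockRecord:
        # Models: hdfs scm -block -info xxx (raises KeyError if unknown)
        return self._records[block_id]

    def delete(self, block_id: str) -> bool:
        # Models: hdfs scm -block -delete xxx
        # Returns True if a record was removed, False if the id was unknown.
        return self._records.pop(block_id, None) is not None

    def delete_corrupted(self, confirm: Callable[[str], bool]) -> int:
        # Models: hdfs scm -block -delete --corrupted
        # Nothing is removed unless the confirm callback returns True.
        ids = self.list_corrupted()
        if not ids:
            return 0
        if not confirm(f"Delete {len(ids)} corrupted block(s)? [y/N] "):
            return 0
        for block_id in ids:
            del self._records[block_id]
        return len(ids)
```

In a real CLI the `confirm` callback would read from the keyboard, for example `lambda prompt: input(prompt).strip().lower() == "y"`. Passing it in as a parameter keeps the bulk delete testable without a terminal.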
I have set the priority to major, because I don't think this is a critical
feature that must be addressed now (let's get this done as a post-merge
task). At present, we have the alternative of using SQLCli to dump the DB info
for debugging. Also, as [~linyiqun] commented, it might be good to start by
exposing corrupted blocks in SCM JMX, which is a smaller task and can help us
understand how big the problem is.
Thanks
> Ozone: Purge metadata of deleted blocks after max retry times
> -------------------------------------------------------------
>
> Key: HDFS-12328
> URL: https://issues.apache.org/jira/browse/HDFS-12328
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Yuanbo Liu
> Assignee: Yuanbo Liu
> Labels: OzonePostMerge
>
> In HDFS-12283, we set the value of count to -1 if blocks cannot be deleted
> after max retry times. We need to provide APIs for admins to purge the "-1"
> metadata manually. Implement these commands:
> list the txids
> {code}
> hdfs scm -txid list -count <number> -retry <number>
> {code}
> delete the txid
> {code}
> hdfs scm -txid delete -id <txid>
> {code}