[ 
https://issues.apache.org/jira/browse/HBASE-6205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13398544#comment-13398544
 ] 

Jesse Yates commented on HBASE-6205:
------------------------------------

I don't know if this feature needs to be so far-reaching. We are already 
getting preservation of the hfiles on delete via HBASE-5547 and, as stack 
says, HDFS supports a /trash directory with a configurable deletion period 
(http://hadoop.apache.org/common/docs/r1.0.3/hdfs_design.html#Space+Reclamation).
 

It would make more sense, when we delete the table, to just store a list of 
the table's current files in a single file added to /trash, so we know which 
files to include and which to exclude when recovering the table. This solves 
the issue of periodic cleanup, minimizes the possible locations of old hfiles, 
and still lets you reasonably recover a table.
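
To make the idea concrete, here is a rough sketch of such a file list. It is 
only an illustration: it uses java.nio.file on a local directory as a stand-in 
for HDFS, and the class and method names (TableManifest, listTableFiles, 
filesToRestore) are made up for this example, not taken from HBase or from the 
attached patches.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of HBase: records which files belonged to a
// table at delete time so a later recovery knows what to include.
final class TableManifest {
    private TableManifest() {}

    // Collect every regular file under tableDir as a path relative to tableDir.
    static List<String> listTableFiles(Path tableDir) throws IOException {
        try (Stream<Path> walk = Files.walk(tableDir)) {
            return walk.filter(Files::isRegularFile)
                       .map(p -> tableDir.relativize(p).toString())
                       .sorted()
                       .collect(Collectors.toList());
        }
    }

    // Write the manifest as a single file, one relative path per line.
    static void write(Path manifest, List<String> files) throws IOException {
        Files.write(manifest, files, StandardCharsets.UTF_8);
    }

    // On recovery, keep only the candidates that the manifest names;
    // anything else found in the trash is excluded.
    static List<String> filesToRestore(Path manifest, List<String> candidates)
            throws IOException {
        Set<String> wanted =
            new HashSet<>(Files.readAllLines(manifest, StandardCharsets.UTF_8));
        List<String> result = new ArrayList<>();
        for (String candidate : candidates) {
            if (wanted.contains(candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

The manifest is the only new thing written at delete time; the hfiles 
themselves stay wherever the existing mechanisms put them.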

On the other hand, I would argue that this feature is a bit excessive; it 
isn't something traditional tables support. You should be _very_ careful when 
dropping tables and not just do things willy-nilly on production (in 
particular, you can make sure all production changes run via (possibly 
generated) scripts, so you can validate them and not 'accidentally' drop 
tables when moving from dev to production). The traditional thought here is 
that you can recover, if you are fast enough, from the local fs (in this case 
hdfs /trash) or from a backup (everyone takes those periodically, right?). Am 
I missing something?

+1 on making this configurable.
                
> Support an option to keep data of dropped table for some time
> -------------------------------------------------------------
>
>                 Key: HBASE-6205
>                 URL: https://issues.apache.org/jira/browse/HBASE-6205
>             Project: HBase
>          Issue Type: New Feature
>    Affects Versions: 0.94.0, 0.96.0
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>             Fix For: 0.96.0
>
>         Attachments: HBASE-6205.patch, HBASE-6205v2.patch, 
> HBASE-6205v3.patch, HBASE-6205v4.patch, HBASE-6205v5.patch
>
>
> A user may drop a table accidentally because of erroneous code or for other 
> reasons.
> Unfortunately, this happened in our environment because a user confused the 
> production cluster with the testing cluster.
> So, as a suggestion: do we need to support an option to keep the data of a 
> dropped table for some time, e.g. 1 day?
> In the patch:
> We create a new dir named .trashtables in the root dir.
> In the DeleteTableHandler, we move the files in the dropped table's dir to 
> the trash table dir instead of deleting them directly.
> We also add a new class, TrashCleaner, which periodically checks for dropped 
> tables and cleans those whose keep time has expired.
> The default keep time for dropped tables is 1 day, and the check period is 
> 1 hour.
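
For reference, the retention rule in the quoted description (keep time of 1 
day, check period of 1 hour) could be sketched roughly as below. This is not 
the TrashCleaner from the attached patches; the class name TrashRetention, its 
methods, and the choice to treat a table as expired once its age reaches the 
keep time are all assumptions made for illustration.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of the retention rule; the real TrashCleaner may differ.
final class TrashRetention {
    // Defaults taken from the issue description.
    static final Duration DEFAULT_KEEP_TIME = Duration.ofDays(1);
    static final Duration DEFAULT_CHECK_PERIOD = Duration.ofHours(1);

    private TrashRetention() {}

    // A dropped table is eligible for cleanup once its age reaches keepTime
    // (assumption: the boundary is inclusive).
    static boolean isExpired(Instant droppedAt, Instant now, Duration keepTime) {
        return Duration.between(droppedAt, now).compareTo(keepTime) >= 0;
    }

    // With a periodic check, data can outlive keepTime by up to one period.
    static Duration worstCaseLifetime(Duration keepTime, Duration checkPeriod) {
        return keepTime.plus(checkPeriod);
    }
}
```

Because the check is periodic, a dropped table's data can outlive the keep 
time by up to one check period before a cleaner run removes it.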

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
