Adar Dembo commented on KUDU-2919:

I recall HDFS' trash feature being fraught with issues, though that was partly 
because it was implemented client-side early on.

Could you describe your expectations regarding a table or partition that's been 
"trashed"? To whom is it visible, and how?

One approach we've discussed is to extend Kudu's MVCC model to cover 
metadata. That is, just as data remains reachable via a historical scan until 
it ages out, so would metadata. DROP TABLE or DROP PARTITION wouldn't actually 
delete anything right away; they'd just mark the table/partition as deleted at 
some timestamp. The dropped table/partition would still be visible to a 
GetTableLocations call with an appropriate timestamp (one predating the drop), 
but hidden from a call with no timestamp.
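
To make the idea concrete, here is a minimal sketch in Python. It is not Kudu 
code: Catalog, TableEntry, history_max_age, and the snapshot_ts parameter are 
all hypothetical names, timestamps are plain integers, and the sketch ignores 
partitions and re-creating a table under a dropped name. It only shows a drop 
that records a timestamp, a lookup that honors an optional snapshot timestamp, 
and a GC pass that removes the metadata once it has aged out.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class TableEntry:
    # Hypothetical catalog record; field names are illustrative only.
    name: str
    created_at: int
    tablets: List[str]
    dropped_at: Optional[int] = None  # None means the table is live


class Catalog:
    """Toy catalog where DROP marks a timestamp instead of deleting."""

    def __init__(self, history_max_age: int) -> None:
        # Assumed retention window, in the same units as the timestamps.
        self.history_max_age = history_max_age
        self.entries: Dict[str, TableEntry] = {}

    def create_table(self, name: str, tablets: List[str], now: int) -> None:
        self.entries[name] = TableEntry(name, now, list(tablets))

    def drop_table(self, name: str, now: int) -> None:
        # Mark only; nothing is deleted until gc() ages the entry out.
        entry = self.entries[name]
        if entry.dropped_at is None:
            entry.dropped_at = now

    def get_table_locations(
        self, name: str, snapshot_ts: Optional[int] = None
    ) -> Optional[List[str]]:
        entry = self.entries.get(name)
        if entry is None:
            return None
        if snapshot_ts is None:
            # No timestamp: dropped tables are hidden.
            return list(entry.tablets) if entry.dropped_at is None else None
        # With a timestamp: visible only if the table existed at that time.
        if snapshot_ts < entry.created_at:
            return None
        if entry.dropped_at is not None and snapshot_ts >= entry.dropped_at:
            return None
        return list(entry.tablets)

    def gc(self, now: int) -> List[str]:
        # Physically remove entries whose drop is older than the window.
        expired = [
            name
            for name, entry in self.entries.items()
            if entry.dropped_at is not None
            and now - entry.dropped_at > self.history_max_age
        ]
        for name in expired:
            del self.entries[name]
        return expired
```

In this sketch, "undropping" a table would amount to clearing dropped_at 
before gc() removes the entry, which is roughly what a trash feature would 
offer.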

> It's useful to support trash while drop partition/tables
> --------------------------------------------------------
>                 Key: KUDU-2919
>                 URL: https://issues.apache.org/jira/browse/KUDU-2919
>             Project: Kudu
>          Issue Type: New Feature
>            Reporter: HeLifu
>            Priority: Major
> In order to shorten the recovery time of erroneously dropped partitions or 
> tables, it's useful to support trash functionality. For example, when we use 
> synchronization tools like Sqoop to synchronize data from a DBMS to Kudu, if 
> the synchronized table (or one of its partitions) is dropped unexpectedly, it 
> takes a long time to re-synchronize the data in full.

