[jira] [Commented] (HBASE-11292) Add an "undelete" operation

James Taylor (JIRA) Sat, 26 Jul 2014 15:06:06 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-11292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075496#comment-14075496
 ]


James Taylor commented on HBASE-11292:
--------------------------------------

bq. I think there is no difference in terms of the read because you need to 
read the previous "delete" put as well to construct the "undelete" put while 
the other way need to read&write more data.
[~jeffreyz], the Undelete is issued by the client, the same client that issued 
the Delete. It's done to undo the effects of a failed transaction (i.e. like a 
compensating transaction). You don't need to look up the prior state of the row. 
You just need to issue an Undelete with the same rowkey and timestamp as the 
Delete.
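
To make that concrete, here is a minimal in-memory sketch of the semantics I have in mind. It is not the HBase API: there is no Undelete operation today, and the class and method names (UndeleteSketch, put, delete, undelete, get) are made up for illustration. It assumes a delete marker at timestamp T masks all cells with timestamp <= T, and that an undelete marker with the same rowkey and timestamp cancels exactly that marker.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Toy model only; all names here are hypothetical, not HBase classes.
class UndeleteSketch {
    // row -> (timestamp -> value)
    private final Map<String, TreeMap<Long, String>> cells = new HashMap<>();
    // row -> timestamps of delete / undelete markers
    private final Map<String, TreeSet<Long>> deletes = new HashMap<>();
    private final Map<String, TreeSet<Long>> undeletes = new HashMap<>();

    void put(String row, long ts, String value) {
        cells.computeIfAbsent(row, r -> new TreeMap<>()).put(ts, value);
    }

    // Writes a marker; does not read or remove any cell.
    void delete(String row, long ts) {
        deletes.computeIfAbsent(row, r -> new TreeSet<>()).add(ts);
    }

    // Also just a marker write: same rowkey and timestamp as the delete,
    // with no lookup of the prior state of the row.
    void undelete(String row, long ts) {
        undeletes.computeIfAbsent(row, r -> new TreeSet<>()).add(ts);
    }

    // Returns the newest cell value that is not masked by a live delete.
    String get(String row) {
        TreeMap<Long, String> versions = cells.get(row);
        if (versions == null) {
            return null;
        }
        for (Map.Entry<Long, String> e : versions.descendingMap().entrySet()) {
            if (!isMasked(row, e.getKey())) {
                return e.getValue();
            }
        }
        return null;
    }

    // A cell is masked if some delete marker at or after its timestamp
    // has no undelete marker with the same timestamp.
    private boolean isMasked(String row, long cellTs) {
        TreeSet<Long> dels = deletes.getOrDefault(row, new TreeSet<>());
        TreeSet<Long> undels = undeletes.getOrDefault(row, new TreeSet<>());
        for (long delTs : dels.tailSet(cellTs, true)) {
            if (!undels.contains(delTs)) {
                return true;
            }
        }
        return false;
    }
}
```

In this model a client whose transaction failed after delete("row1", 20) simply calls undelete("row1", 20), and a later get("row1") sees the earlier cell again.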

bq. Yet another way to look at it is: why undelete at all? Ignoring failed 
transactions needs to be implemented anyway, so instead of undo one could just 
continue to ignore the transaction.
[~lhofhansl], I agree, this is a fallback. It puts more pressure on being able 
to "prune" the list of invalid transactions. If that can be figured out and done efficiently, 
then the need for an Undelete decreases. Maybe that should be discussed in a 
separate JIRA?

Assuming that an Undelete may only be done for a Delete (i.e. an Undelete 
cannot be undeleted), is it feasible then? Besides requiring another set of Bloom 
filters, are there more gotchas? Is it too heavy a burden to have another set 
of Bloom filters? Any other options?

> Add an "undelete" operation
> ---------------------------
>
>                 Key: HBASE-11292
>                 URL: https://issues.apache.org/jira/browse/HBASE-11292
>             Project: HBase
>          Issue Type: New Feature
>          Components: Deletes
>            Reporter: Gary Helmling
>              Labels: Phoenix
>
> While column families can be configured to keep deleted cells (allowing time 
> range queries to still retrieve those cells), deletes are still somewhat 
> unique in that they are irreversible operations.  Once a delete has been 
> issued on a cell, the only way to "undelete" it is to rewrite the data with a 
> timestamp newer than the delete.
> The idea here is to add an "undelete" operation, that would make it possible 
> to cancel a previous delete.  An undelete operation will be similar to a 
> delete, in that it will be written as a marker ("tombstone" doesn't seem like 
> the right word).  The undelete marker, however, will sort prior to a delete 
> marker, canceling the effect of any following delete.
> In the absence of a column family configured to KEEP_DELETED_CELLS, we can't 
> be sure if a prior delete marker and the affected cells have already been 
> garbage collected.  In this case (column family not configured with 
> KEEP_DELETED_CELLS) it may be necessary for the server to reject undelete 
> operations to avoid creating the appearance of a client contract for undeletes 
> that can't reliably be honored.
> I think there are additional subtleties of the implementation to be worked 
> out, but I'm also interested in a broader discussion of interest in this 
> capability.
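
The marker ordering and the server-side rejection described in the issue above could be modeled roughly as follows. This is only a sketch under stated assumptions: the names (MarkerOrderSketch, Marker, effectiveDeletes, checkUndeleteAllowed) are hypothetical, markers are compared on timestamp and type only, and the keepDeletedCells flag stands in for whatever the column family configuration reports.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model only; all names here are hypothetical, not HBase classes.
class MarkerOrderSketch {
    // Declaration order doubles as sort order at equal timestamps.
    enum Type { UNDELETE, DELETE }

    static final class Marker {
        final long ts;
        final Type type;

        Marker(long ts, Type type) {
            this.ts = ts;
            this.type = type;
        }
    }

    // Newest timestamp first; at equal timestamps the undelete marker
    // sorts before the delete marker it is meant to cancel.
    static final Comparator<Marker> ORDER = (a, b) -> {
        int c = Long.compare(b.ts, a.ts);
        return c != 0 ? c : a.type.compareTo(b.type);
    };

    // Returns the timestamps of delete markers that are still in effect.
    static List<Long> effectiveDeletes(List<Marker> markers) {
        List<Marker> sorted = new ArrayList<>(markers);
        sorted.sort(ORDER);
        List<Long> result = new ArrayList<>();
        boolean sawUndelete = false;
        long undeleteTs = 0L;
        for (Marker m : sorted) {
            if (m.type == Type.UNDELETE) {
                sawUndelete = true;
                undeleteTs = m.ts;
            } else if (!sawUndelete || undeleteTs != m.ts) {
                result.add(m.ts);
            }
        }
        return result;
    }

    // Assumed server-side guard: without KEEP_DELETED_CELLS the deleted
    // cells may already be garbage collected, so the undelete is refused.
    static void checkUndeleteAllowed(boolean keepDeletedCells) {
        if (!keepDeletedCells) {
            throw new UnsupportedOperationException(
                "undelete requires KEEP_DELETED_CELLS on the column family");
        }
    }
}
```

Because the undelete marker is seen first during the scan, the scanner only has to remember the most recent undelete timestamp to decide whether the delete that follows is canceled.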



--
This message was sent by Atlassian JIRA
(v6.2#6252)
