Deferred deletes
----------------
Key: HBASE-2834
URL: https://issues.apache.org/jira/browse/HBASE-2834
Project: HBase
Issue Type: New Feature
Reporter: Andrew Purtell
In a blog post, James Hamilton mentions deferred deletes in passing:
{quote}
If you have an application error, administrative error, or database
implementation bug that loses data, then it is simply gone unless you have an
offline copy. This, by the way, is why I'm a big fan of deferred delete. This
is a technique where deleted items are marked as deleted but not garbage
collected until some days or preferably weeks later. Deferred delete is not
full protection but it has saved my butt more than once and I'm a believer. See
On Designing and Deploying Internet-Scale Services
(http://mvdirona.com/jrh/talksAndPapers/JamesRH_Lisa.pdf) for more detail.
{quote}
(See
http://perspectives.mvdirona.com/2010/04/07/StonebrakerOnCAPTheoremAndDatabases.aspx)
Because deletes are tombstones (at least once the initial write has been
flushed from the memstore), HBase could support deferred delete if tombstones
could somehow be invalidated, in effect an undelete operation. This could be
accomplished by adding support for tombstones that cover delete tombstones.
It would complicate major compaction but otherwise not touch much. A typical
use case might be "resurrect any data deleted from _ts1_ to _ts2_", for
example a period of 4 hours when an application error was operative. In this
case a new write would be issued to the table: a tombstone covering any
deletes over that period of time. Users would defer major compactions until
safe checkpoint periods, since a major compaction physically removes deleted
data along with its tombstones. Such guarantees could optionally be extended
to the memstore by using tombstones there as well. But it would probably be
sufficient to provide guidance that forcing a flush is necessary to ensure
edits are persisted in a way that allows for undeletion.
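
To make the proposal concrete, here is a minimal sketch of the "tombstone
covering deletes" idea as a toy in-memory model. It is not HBase code and
uses no HBase API: the class ToyStore and its methods put, delete, undelete,
get and major_compact are hypothetical names invented for illustration. It
assumes a delete marker at timestamp ts masks every put on the same row with a
timestamp at or below ts, and that an undelete marker for [ts1, ts2] voids
every delete marker whose timestamp falls in that range.

```python
from dataclasses import dataclass, field


@dataclass
class ToyStore:
    """Toy single-column store illustrating undelete markers (hypothetical, not HBase)."""
    puts: list = field(default_factory=list)       # (row, ts, value)
    deletes: list = field(default_factory=list)    # (row, ts): masks puts with put_ts <= ts
    undeletes: list = field(default_factory=list)  # (ts1, ts2): voids deletes in [ts1, ts2]

    def put(self, row, ts, value):
        self.puts.append((row, ts, value))

    def delete(self, row, ts):
        # A delete only records a tombstone; no data is removed yet.
        self.deletes.append((row, ts))

    def undelete(self, ts1, ts2):
        # A "tombstone for tombstones" covering deletes issued from ts1 to ts2.
        self.undeletes.append((ts1, ts2))

    def _live_deletes(self):
        # Delete markers that are not covered by any undelete marker.
        return [(r, ts) for r, ts in self.deletes
                if not any(a <= ts <= b for a, b in self.undeletes)]

    def _masked(self, row, ts, live):
        return any(dr == row and ts <= dts for dr, dts in live)

    def get(self, row):
        # Return the newest visible value for the row, or None.
        live = self._live_deletes()
        visible = [(ts, v) for r, ts, v in self.puts
                   if r == row and not self._masked(r, ts, live)]
        return max(visible, key=lambda p: p[0])[1] if visible else None

    def major_compact(self):
        # Physically drop masked puts and all markers; undelete is impossible afterwards.
        live = self._live_deletes()
        self.puts = [(r, ts, v) for r, ts, v in self.puts
                     if not self._masked(r, ts, live)]
        self.deletes = []
        self.undeletes = []


store = ToyStore()
store.put("row1", ts=100, value="a")
store.delete("row1", ts=200)   # erroneous delete inside the bad window
print(store.get("row1"))       # None
store.undelete(150, 250)       # resurrect anything deleted from ts=150 to ts=250
print(store.get("row1"))       # a
```

In this model the undelete only works while the delete markers and the data
they mask still exist, which is why major compactions would have to be
deferred until a safe checkpoint.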