[ 
https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-4536:
---------------------------------

    Fix Version/s:     (was: 0.92.0)

Turns out this is a bit more complicated than I thought. There are three types 
of deletes:
# version deletes - effective for a specific version of a specific column
# column deletes - effective for all versions of a specific column
# family deletes - effective for all versions of all columns of a family

The first two sort before the puts they affect, based on their respective 
timestamps, but after newer puts.
Family deletes always sort before all versions of all columns.
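
To make the ordering above concrete, here is a minimal sketch of a comparator 
over a simplified cell model. The Cell and Type names are hypothetical and are 
not HBase's KeyValue classes; the sketch assumes one row and one family, and 
that a family delete carries an empty qualifier.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the cells in one row of one column family.
// These types are for illustration only; they are not HBase's KeyValue classes.
class DeleteSortSketch {
    enum Type { PUT, DELETE_VERSION, DELETE_COLUMN, DELETE_FAMILY }

    static final class Cell {
        final String qualifier;
        final long timestamp;
        final Type type;

        Cell(String qualifier, long timestamp, Type type) {
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.type = type;
        }

        @Override
        public String toString() {
            return type + "(" + qualifier + "@" + timestamp + ")";
        }
    }

    static int compare(Cell a, Cell b) {
        // Family deletes sort ahead of every column, whatever their timestamp.
        boolean aFamily = a.type == Type.DELETE_FAMILY;
        boolean bFamily = b.type == Type.DELETE_FAMILY;
        if (aFamily != bFamily) {
            return aFamily ? -1 : 1;
        }
        // Then group by column.
        int byQualifier = a.qualifier.compareTo(b.qualifier);
        if (byQualifier != 0) {
            return byQualifier;
        }
        // Within a column, newer timestamps come first.
        int byTimestamp = Long.compare(b.timestamp, a.timestamp);
        if (byTimestamp != 0) {
            return byTimestamp;
        }
        // At equal timestamps, a delete marker precedes the put it affects.
        return Boolean.compare(a.type == Type.PUT, b.type == Type.PUT);
    }

    static List<Cell> sorted(List<Cell> cells) {
        List<Cell> copy = new ArrayList<>(cells);
        copy.sort(DeleteSortSketch::compare);
        return copy;
    }

    public static void main(String[] args) {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell("a", 10, Type.PUT));
        cells.add(new Cell("b", 5, Type.PUT));
        cells.add(new Cell("a", 20, Type.DELETE_COLUMN));
        cells.add(new Cell("", 1, Type.DELETE_FAMILY));
        cells.add(new Cell("a", 30, Type.PUT));
        System.out.println(sorted(cells));
    }
}
```

In this example the column delete at timestamp 20 lands between the put at 30 
and the put at 10, while the family delete comes first even though it has the 
oldest timestamp.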

The problem is deciding when the delete rows (the marker rows) themselves can 
be removed during a major compaction.

For #1 and #2 I can just do version counting, and newer puts will eventually 
push out the delete markers from the store.
With #3 this will never happen as they always sort before all puts of the same 
family, regardless of any timestamp set on them.
Here it is necessary to scan all puts for that family and then decide whether 
the delete needs to be included based on whether the delete had any effect on 
any of the puts in the same family.
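
The two checks could be sketched roughly as follows. This is only one possible 
reading of the above, not the actual patch: the method names are hypothetical, 
and the version-counting rule assumes a marker is pushed out once at least 
maxVersions newer puts exist in its column.

```java
// Hypothetical helper names; a sketch of the two retention checks, not HBase code.
class DeleteMarkerRetentionSketch {

    // Types #1 and #2: version counting (assumed rule). The marker stays until
    // at least maxVersions newer puts in the same column have pushed it out.
    static boolean columnMarkerSurvives(int newerPutsInColumn, int maxVersions) {
        return newerPutsInColumn < maxVersions;
    }

    // Type #3: a family delete always sorts first, so counting cannot push it
    // out. Scan every put in the family and keep the marker only if it still
    // affects at least one put (put timestamp <= delete timestamp).
    static boolean familyMarkerSurvives(long deleteTimestamp, long[] putTimestampsInFamily) {
        for (long putTimestamp : putTimestampsInFamily) {
            if (putTimestamp <= deleteTimestamp) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Two newer puts, three versions allowed: the marker is still retained.
        System.out.println(columnMarkerSurvives(2, 3));
        // Family delete at 50 with remaining puts at 60 and 70: nothing is affected.
        System.out.println(familyMarkerSurvives(50L, new long[] {60L, 70L}));
    }
}
```

The family check needs the timestamps of all puts in the family up front, which 
is why it cannot be decided while streaming past the marker the way the 
per-column count can.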

Because of this, I am moving this out of 0.92, as the changes will be bigger 
than expected. Put it back if you think otherwise.

I still think that time travel is an important feature of HBase, and that it 
is incomplete if it cannot include deleted rows.

                
> Allow CF to retain deleted rows
> -------------------------------
>
>                 Key: HBASE-4536
>                 URL: https://issues.apache.org/jira/browse/HBASE-4536
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver
>    Affects Versions: 0.92.0
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>
> Parent allows for a cluster to retain rows for a TTL or keep a minimum number 
> of versions.
> However, if a client deletes a row, all versions older than the delete 
> tombstone will be removed at the next major compaction (and even at memstore 
> flush - see HBASE-4241).
> There should be a way to retain those versions to guard against software error.
> I see two options here:
> 1. Add a new flag to HColumnDescriptor. Something like "RETAIN_DELETED".
> 2. Fold this into the parent change. I.e. keep the minimum number of 
> versions even past the delete marker.
> #1 would allow for more flexibility. #2 comes somewhat naturally with parent 
> (from a user viewpoint)
> Comments? Any other options?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
