
No, that is not what I am suggesting. 

Perhaps I am missing something. Was the OP interested in cells or in row 

Two different issues. 

On Oct 23, 2012, at 1:35 PM, lars hofhansl <> wrote:

> HBase has time range queries. You can say "give me the data as of time T" or 
> "give me the data between X and Y". How far back you want to retain your data 
> is specified via TTL and VERSIONS.
> But... If you delete the data at T+X (X>0), a query as of time T won't return 
> anything, even though at T the data was still there.
> If you don't use TTL and/or VERSIONS in HBase you won't need this feature.
> If you do use these you're doing so because you want to get to the older data. 
> And if you delete stuff, chances are you want KEEP_DELETED_CELLS enabled.
> So within the boundaries specified by TTL/VERSIONS you can get to the data as 
> of any time.
> By your logic nobody should use TTL/VERSIONS, which is nonsense.
> ________________________________
> From: Michael Segel <>
> To: lars hofhansl <> 
> Cc: "" <> 
> Sent: Tuesday, October 23, 2012 4:41 AM
> Subject: Re: How to config hbase0.94.2 to retain deleted data
> "Deleted cells are still subject to TTL and there will never be more than 
> "maximum number of versions" deleted cells. A new "raw" scan option returns 
> all deleted rows and the delete markers."
> This is different from the idea suggested by the OP. Here deleted cells still 
> get deleted. Just that when the compaction flag comes along, it's told to 
> ignore them. 
> So if I say a column can have 3 versions (cells) then if I insert another 
> value for that row:column key, I push that deleted cell down the stack.  
> Enough times, and it's gone. 
> In theory, this feature would be useful if I wanted an OLTP implementation on 
> top of HBase. It would allow the transaction to bridge a compaction cycle. 
> However, that's pretty much it. 
> This feature doesn't translate well beyond this. 
> It also raises the following: How do I handle long transactions (OLTP), 
> timeouts, and isolation levels? 
> If you look at this at the row level... definitely not a good idea. Think of 
> fat clogging an artery.
> On Oct 23, 2012, at 12:22 AM, lars hofhansl <> wrote:
>> Without it you cannot do correct as-of-time queries when it comes to deletes.
>> -- Lars
>> From: Michael Segel <>
>> To:; lars hofhansl <> 
>> Sent: Monday, October 22, 2012 9:18 PM
>> Subject: Re: How to config hbase0.94.2 to retain deleted data
>>> Curious, why do you think this is better than using the keep-deleted-cells 
>>> feature?
>>> (It might well be, just curious)
>> Ok... so what exactly does this feature mean? 
>> Suppose I have 500 rows within a region. I set this feature to be true. 
>> I do a massive delete and there are only 50 rows left standing. 
>> So if I do a count of the number of rows in the region, I see only 50, yet 
>> if I compact the table, it's still full. 
>> Granted I'm talking about rows and not cells, but the idea is the same. IMHO 
>> you're asking for more headaches than you solve. 
>> KISS would suggest that moving deleted data into a different table would 
>> yield better performance in the long run. 
>> On Oct 21, 2012, at 7:23 PM, lars hofhansl <> wrote:
>>> That'd work too. Requires the regionservers to make remote updates to other 
>>> regionservers, though. And you have to trap each and every change (Put, 
>>> Delete, Increment, Append, RowMutations, etc)
>>> Curious, why do you think this is better than using the keep-deleted-cells 
>>> feature?
>>> (It might well be, just curious)
>>> -- Lars
>>> ----- Original Message -----
>>> From: Michael Segel <>
>>> To:
>>> Cc: 
>>> Sent: Sunday, October 21, 2012 4:34 PM
>>> Subject: Re: How to config hbase0.94.2 to retain deleted data
>>> I would suggest that you use your coprocessor to copy the data to a 
>>> 'backup' table when you mark them for delete. 
>>> Then as major compaction hits, the rows are deleted from the main table, 
>>> but still reside undeleted in your delete table. 
>>> Call it a history table. 
>>> On Oct 21, 2012, at 3:53 PM, yun peng <> wrote:
>>>> Hi, All,
>>>> I want to retain all deleted key-value pairs in hbase. I have tried to
>>>> config HColumnDescriptor as follows to make it return deleted cells.
>>>>   public void postOpen(ObserverContext<RegionCoprocessorEnvironment> e) {
>>>>     HTableDescriptor htd = e.getEnvironment().getRegion().getTableDesc();
>>>>     HColumnDescriptor hcd = htd.getFamily(Bytes.toBytes("cf"));
>>>>     hcd.setKeepDeletedCells(true);
>>>>     hcd.setBlockCacheEnabled(false);
>>>>   }
>>>> However, it does not work for me: when I issue a delete and then query
>>>> by an older timestamp, the old data does not show up.
>>>> hbase(main):119:0> put 'usertable', "key1", 'cf:c1', "v1", 99
>>>> hbase(main):120:0> put 'usertable', "key1", 'cf:c1', "v2", 101
>>>> hbase(main):121:0> delete 'usertable', "key1", 'cf:c1', 100
>>>> hbase(main):122:0> get 'usertable', 'key1', {COLUMN => 'cf:c1', TIMESTAMP
>>>> => 99, VERSIONS => 4}
>>>> COLUMN                CELL
>>>> 0 row(s) in 0.0040 seconds
>>>> hbase(main):123:0> get 'usertable', 'key1', {COLUMN => 'cf:c1', TIMESTAMP
>>>> => 100, VERSIONS => 4}
>>>> COLUMN                CELL
>>>> 0 row(s) in 0.0050 seconds
>>>> hbase(main):124:0> get 'usertable', 'key1', {COLUMN => 'cf:c1', TIMESTAMP
>>>> => 101, VERSIONS => 4}
>>>> COLUMN                CELL
>>>> cf:c1                timestamp=101, value=v2
>>>> 1 row(s) in 0.0050 seconds
>>>> Note this is a new feature in 0.94.2 (HBASE-4536).
>>>> I did not find much sample code online, so... does anyone here have
>>>> experience in using HBASE-4536? How should one config
>>>> hbase to enable this feature?
>>>> Thanks
>>>> Yun
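
A minimal sketch of the schema-level alternative to the postOpen approach in the original question, assuming the same 'usertable' table and 'cf' family. It sets KEEP_DELETED_CELLS on the column family through the shell instead of mutating the descriptor inside a coprocessor hook; that the hook's change never reaches the already-opened store is an assumption, not something confirmed in this thread. The behaviour on 0.94.2 is untested here, and cells that were deleted and compacted away before the attribute was set will not come back.

```
# Set the attribute on the column family schema.
# (On 0.94 the table normally has to be disabled for alter.)
disable 'usertable'
alter 'usertable', {NAME => 'cf', KEEP_DELETED_CELLS => true}
enable 'usertable'

# Repeat the test from the original question.
put 'usertable', 'key1', 'cf:c1', 'v1', 99
put 'usertable', 'key1', 'cf:c1', 'v2', 101
delete 'usertable', 'key1', 'cf:c1', 100

# As-of query: the time range [0, 100) ends before the delete marker at 100.
get 'usertable', 'key1', {COLUMN => 'cf:c1', TIMERANGE => [0, 100], VERSIONS => 4}

# Raw scan: lists delete markers together with the deleted cells.
scan 'usertable', {RAW => true, VERSIONS => 10}
```

The get is expected to surface the cell written at timestamp 99 only because its time range excludes the delete marker; a range that includes timestamp 100 should still hide it, which is the as-of-time behaviour Lars describes above. Retained cells remain subject to TTL and the family's VERSIONS limit.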
