[ 
https://issues.apache.org/jira/browse/HBASE-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12693596#action_12693596
 ] 

Erik Holstad commented on HBASE-1249:
-------------------------------------

Bringing DeleteFamily into the mix creates problems with the current
layout, because every time you merge the deletes from the previous
storefile with the current deletes, and every time you compare those deletes
with the position you are looking at in the current storefile, you also
need to compare timestamps. There are a couple of ways around this as I
see it.

1. Only let the user set the timestamp for a DeleteFamily to now, where
now can be System.currentTimeMillis() or some user-generated now. This would
mean the current memCache is cleaned and all other storefiles are considered
deleted for this row and family. You would then need only one check per
storefile to see if it contains a DeleteFamily entry; if it does, you know
that all data in that storefile is OK and that you don't need to look in any
further storefiles.

2. Check whether there is a DeleteFamily in the deletes, and have 2 different
methods for the cases where there is a DeleteFamily entry and where there
isn't. The downside is that you still pay the cost when you do have a
DeleteFamily; in the worst case you have to do these 2 checks for every
entry in every storefile.

3. Keep the DeleteFamily sorted at the timestamp where it belongs, so that all
deletes are sorted in timestamp order before column. This is a rather big
change, because the puts would also have to be sorted this way for it to
make any sense. Another advantage of this approach is that exiting early
from a query with a timerange "filter" would be more effective.
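
To make the ordering in option 3 concrete, here is a rough sketch for a
single row and family. Everything in it is hypothetical: the Cell class, the
ORDER comparator and the scan method are illustration only and not existing
HBase code, and placing a DeleteFamily marker ahead of puts with the same
timestamp is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class TimestampFirstSketch {
    /** Hypothetical cell within one row and family; not an HBase class. */
    static final class Cell {
        final long timestamp;
        final String column;        // empty string for a DeleteFamily marker
        final boolean deleteFamily;

        Cell(long timestamp, String column, boolean deleteFamily) {
            this.timestamp = timestamp;
            this.column = column;
            this.deleteFamily = deleteFamily;
        }
    }

    /** Newest timestamp first, then DeleteFamily markers, then column name. */
    static final Comparator<Cell> ORDER = new Comparator<Cell>() {
        @Override
        public int compare(Cell a, Cell b) {
            int c = Long.compare(b.timestamp, a.timestamp); // descending
            if (c != 0) return c;
            // Assumption: a marker sorts ahead of puts at the same timestamp.
            c = Boolean.compare(b.deleteFamily, a.deleteFamily);
            if (c != 0) return c;
            return a.column.compareTo(b.column);
        }
    };

    /**
     * Returns "column@timestamp" for every live cell at or after
     * minTimestamp. The loop exits at the first cell that is too old or at
     * the first DeleteFamily marker, since everything sorted after either
     * point is out of range or deleted.
     */
    static List<String> scan(List<Cell> cells, long minTimestamp) {
        List<Cell> sorted = new ArrayList<Cell>(cells);
        Collections.sort(sorted, ORDER);
        List<String> live = new ArrayList<String>();
        for (Cell cell : sorted) {
            if (cell.timestamp < minTimestamp || cell.deleteFamily) {
                break; // early out
            }
            live.add(cell.column + "@" + cell.timestamp);
        }
        return live;
    }
}
```

With timestamp ahead of column in the key, both the timerange check and the
DeleteFamily check become a single stop condition in the scan loop, instead
of a comparison against a separate delete list for every entry.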

I personally like option 3 the most, but I can see people not liking it
because it sort of redefines what HBase is, so I think that number 1 is the
best option, and after that number 2.
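
For comparison, the read path under option 1 could look roughly like the
sketch below. The StoreFileView class and the read method are hypothetical
names for illustration, not existing HBase code. The sketch assumes each
view covers one row and family, that views are visited newest first, and
that a view holding a DeleteFamily contains only data written after that
delete.

```java
import java.util.ArrayList;
import java.util.List;

class DeleteFamilyNowSketch {
    /** Hypothetical view of one storefile (or the memCache) for a row and family. */
    static final class StoreFileView {
        final List<String> values;     // cells that are still valid in this file
        final boolean hasDeleteFamily; // true if the file holds a DeleteFamily

        StoreFileView(List<String> values, boolean hasDeleteFamily) {
            this.values = values;
            this.hasDeleteFamily = hasDeleteFamily;
        }
    }

    /**
     * Collects values newest file first. One check per file is enough:
     * a DeleteFamily stamped "now" covers every file after it in the list,
     * so the loop stops there.
     */
    static List<String> read(List<StoreFileView> newestFirst) {
        List<String> result = new ArrayList<String>();
        for (StoreFileView file : newestFirst) {
            result.addAll(file.values);
            if (file.hasDeleteFamily) {
                break; // all remaining files are deleted for this row and family
            }
        }
        return result;
    }
}
```

The single boolean check per storefile replaces the per-entry timestamp
comparison against the merged delete list.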

Would love to get some input on this, to see if there is anything that
people might have against number 1. Otherwise I will move on with
implementing that option.

Regards Erik

> Rearchitecting of server, client, API, key format, etc for 0.20
> ---------------------------------------------------------------
>
>                 Key: HBASE-1249
>                 URL: https://issues.apache.org/jira/browse/HBASE-1249
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: HBASE-1249-Example-v1.pdf, HBASE-1249-Example-v2.pdf, 
> HBASE-1249-GetQuery-v1.pdf, HBASE-1249-GetQuery-v2.pdf, 
> HBASE-1249-GetQuery-v3.pdf, HBASE-1249-StoreFile-v1.pdf
>
>
> To discuss all the new and potential issues coming out of the change in key 
> format (HBASE-1234): zero-copy reads, client binary protocol, update of API 
> (HBASE-880), server optimizations, etc...

