[jira] Commented: (HBASE-1249) Rearchitecting of server, client, API, key format, etc for 0.20

Jonathan Gray (JIRA) Tue, 17 Mar 2009 23:40:15 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12682935#action_12682935
 ]


Jonathan Gray commented on HBASE-1249:
--------------------------------------

We need to do some testing on that.  Scanning through the deletes in the 
memcache might be pretty fast, regardless.  However, I think it sounds like a 
good idea and a basis for some more thoughts.

And yeah, there should probably be no such thing as a DeleteRow on the server.  
And this is especially the case with locality groups as you'd need to seek to 
the start of the row every time before seeking down to your family.

But in thinking more about memcache deletes... when we flush the memcache, we 
can guarantee that none of the values being flushed have been deleted (if we do 
as above, applying deletes to the memcache).  So we have a list of deletes that 
apply to older store files.  Then we start a new memcache.
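
A rough sketch of that flush behaviour, to make the idea concrete.  This is 
not existing HBase code: MemcacheSketch, Flushed, and the string cell format 
are made up for illustration, and it assumes a put that arrives after a 
covering delete is simply dropped.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch, not HBase code: a memcache that applies deletes to
// its own contents and keeps them around for the older store files.
final class MemcacheSketch {
    // column -> (stamp -> value); only live cells are kept here.
    private final TreeMap<String, TreeMap<Long, String>> cells = new TreeMap<>();
    // column -> newest delete stamp; a delete covers every stamp <= it.
    private final TreeMap<String, Long> deletes = new TreeMap<>();

    void put(String column, long stamp, String value) {
        Long deletedUpTo = deletes.get(column);
        if (deletedUpTo != null && stamp <= deletedUpTo) {
            return; // assumption: a put covered by an existing delete is dropped
        }
        cells.computeIfAbsent(column, c -> new TreeMap<>()).put(stamp, value);
    }

    void deleteColumn(String column, long stamp) {
        deletes.merge(column, stamp, Long::max);
        TreeMap<Long, String> versions = cells.get(column);
        if (versions != null) {
            versions.headMap(stamp, true).clear(); // drop everything <= stamp
            if (versions.isEmpty()) {
                cells.remove(column);
            }
        }
    }

    /** What a flush would write: live values plus deletes for older files. */
    static final class Flushed {
        final List<String> values = new ArrayList<>();
        final TreeMap<String, Long> deletes = new TreeMap<>();
    }

    Flushed flush() {
        Flushed out = new Flushed();
        // Columns in ascending order, versions newest first within a column.
        cells.forEach((col, versions) ->
            versions.descendingMap().forEach(
                (ts, v) -> out.values.add(col + "@" + ts + "=" + v)));
        out.deletes.putAll(deletes);
        cells.clear();   // start a new, empty memcache
        deletes.clear();
        return out;
    }
}
```

Nothing in Flushed.values is covered by anything in Flushed.deletes, so a 
reader of the resulting storefile only needs those deletes once it moves on 
to older files.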

When we read in the newest storefile, we actually know that we can process it 
without looking at any deletes except those that are in the new memcache.  The 
deletes in this storefile aren't needed until the second-newest one is looked 
at.  And at that point we can read them in bulk from the previous storefile, 
which has already been opened.  We can even compare the stamps of the deletes 
against the storefile stamps and the stamps a query could match, to early out.  
This is a far cry from how things are now... deletes are interspersed and 
duplicated everywhere.

It does seem to make sense to have the deletes ordered above where they apply, 
but then we'd have to check those sections first before reading?  Well, come 
to think of it, what could make sense is to order them below.  The only time 
we actually have deletes in a storefile is when they need to be applied to the 
older storefiles.  So we can scan these deletes at the end: once we have read 
past what we wanted (and still need to read additional storefiles), we can 
scan and seek for deletes pertaining to this row/family/column, if there are 
any.

Those deletes are added to the in-memory deleteset for the remaining storefiles.
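
A minimal sketch of that read path, assuming storefiles are visited newest 
to oldest and each one keeps its deletes after its cells.  All names here 
(StoreReadSketch, StoreFile, Cell, readRow) are hypothetical, and deletes are 
simplified to "column, everything <= stamp".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not HBase code: read one row from newest to oldest
// storefile, carrying an in-memory deleteset forward.
final class StoreReadSketch {
    static final class Cell {
        final String column;
        final long stamp;
        final String value;

        Cell(String column, long stamp, String value) {
            this.column = column;
            this.stamp = stamp;
            this.value = value;
        }

        @Override
        public String toString() {
            return column + "@" + stamp + "=" + value;
        }
    }

    /** One storefile's slice of a row: live cells, then trailing deletes. */
    static final class StoreFile {
        final List<Cell> cells = new ArrayList<>();
        // column -> stamp; covers every version of the column <= stamp.
        final Map<String, Long> deletes = new HashMap<>();
    }

    static List<String> readRow(Map<String, Long> memcacheDeletes,
                                List<StoreFile> newestFirst) {
        // The newest file only has to honour deletes from the live memcache.
        Map<String, Long> deleteSet = new HashMap<>(memcacheDeletes);
        List<String> result = new ArrayList<>();
        for (StoreFile file : newestFirst) {
            for (Cell c : file.cells) {
                Long deletedUpTo = deleteSet.get(c.column);
                if (deletedUpTo == null || c.stamp > deletedUpTo) {
                    result.add(c.toString());
                }
            }
            // Only now pull in this file's own deletes: they apply to the
            // remaining (older) files, never to the file that holds them.
            file.deletes.forEach((col, ts) -> deleteSet.merge(col, ts, Long::max));
        }
        return result;
    }
}
```

A real reader would stop as soon as the query is satisfied, in which case 
the trailing deletes of the last file read are never touched.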

Any rewriting of files must enforce deletions across them, and files must be 
sequential in age if not all are combined.

So, DeleteRow and DeleteFamily would take no time parameters, and would be 
stored with the time of deletion.  Their KeyValues will sort at the end of the 
row, meaning you need to scan to this spot any time you reach the end of what 
you're reading from that store's row and need to read the next storefile.

DeleteColumn would use the current time by default, or you could specify a 
stamp and it would delete everything <= that stamp.  This _could_ sort at the 
end of the column, but is there any point?  It should probably be at the end 
of the row, since this is where you have to seek to look for a DeleteFamily 
anyway.
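
One way to picture that ordering is a comparator that puts every delete 
marker after all of the row's regular cells.  This is only a sketch: 
KeyOrderSketch and its Key class are made-up stand-ins, not the real 
KeyValue format or comparator, and row-level markers are assumed to carry 
empty family/qualifier strings.

```java
import java.util.Comparator;

// Hypothetical sketch, not the real KeyValue comparator: within a row, all
// regular cells sort first and all delete markers sort at the end.
final class KeyOrderSketch {
    static final class Key {
        final String row;
        final String family;     // "" for a row-level marker (assumption)
        final String qualifier;  // "" for row- or family-level markers
        final long stamp;
        final boolean isDelete;

        Key(String row, String family, String qualifier, long stamp, boolean isDelete) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.stamp = stamp;
            this.isDelete = isDelete;
        }
    }

    static final Comparator<Key> ORDER = (a, b) -> {
        int c = a.row.compareTo(b.row);
        if (c != 0) return c;
        // false sorts before true, so deletes land at the end of the row.
        c = Boolean.compare(a.isDelete, b.isDelete);
        if (c != 0) return c;
        c = a.family.compareTo(b.family);
        if (c != 0) return c;
        c = a.qualifier.compareTo(b.qualifier);
        if (c != 0) return c;
        return Long.compare(b.stamp, a.stamp); // newest stamp first
    };
}
```

With this order, a reader that has passed the last cell it wants in a row 
can seek once to the delete section and find DeleteColumn and DeleteFamily 
markers in the same place.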

Delete would be the same thing.  Sorted at the end of the row.  Just need to 
get the deleteset and comparators right so they can do the matching well for 
these different delete types against different cell KeyValues.
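
As a sketch of what that matching could look like for a single row, 
assuming DeleteRow, DeleteFamily, and DeleteColumn cover everything <= their 
stamp and a plain Delete covers exactly one version.  DeleteSetSketch, 
Marker, and the exact-version rule for Delete are assumptions for 
illustration, not a settled design.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not HBase code: the deleteset for one row, matching
// the four delete types against a cell's family, qualifier, and stamp.
final class DeleteSetSketch {
    enum Type { DELETE, DELETE_COLUMN, DELETE_FAMILY, DELETE_ROW }

    static final class Marker {
        final Type type;
        final String family;    // unused (may be null) for DELETE_ROW
        final String qualifier; // unused (may be null) for row/family markers
        final long stamp;

        Marker(Type type, String family, String qualifier, long stamp) {
            this.type = type;
            this.family = family;
            this.qualifier = qualifier;
            this.stamp = stamp;
        }

        boolean covers(String cellFamily, String cellQualifier, long cellStamp) {
            switch (type) {
                case DELETE_ROW:
                    return cellStamp <= stamp;
                case DELETE_FAMILY:
                    return family.equals(cellFamily) && cellStamp <= stamp;
                case DELETE_COLUMN:
                    return family.equals(cellFamily)
                        && qualifier.equals(cellQualifier)
                        && cellStamp <= stamp;
                default: // DELETE: assumed to hit exactly one version
                    return family.equals(cellFamily)
                        && qualifier.equals(cellQualifier)
                        && cellStamp == stamp;
            }
        }
    }

    private final List<Marker> markers = new ArrayList<>();

    void add(Marker marker) {
        markers.add(marker);
    }

    boolean isDeleted(String family, String qualifier, long stamp) {
        for (Marker m : markers) {
            if (m.covers(family, qualifier, stamp)) {
                return true;
            }
        }
        return false;
    }
}
```

The linear scan over markers is only for clarity; a real deleteset would 
want something sorted so the check stays cheap on wide rows.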

It might make sense to have a DeleteRow in this case; it would be less work 
with locality groups.  But not a big deal either way, really.

> Rearchitecting of server, client, API, key format, etc for 0.20
> ---------------------------------------------------------------
>
>                 Key: HBASE-1249
>                 URL: https://issues.apache.org/jira/browse/HBASE-1249
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.20.0
>
>
> To discuss all the new and potential issues coming out of the change in key 
> format (HBASE-1234): zero-copy reads, client binary protocol, update of API 
> (HBASE-880), server optimizations, etc...

