[jira] Commented: (HBASE-1249) Rearchitecting of server, client, API, key format, etc for 0.20

Jonathan Gray (JIRA) Tue, 28 Apr 2009 09:05:56 -0700

    [ https://issues.apache.org/jira/browse/HBASE-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703696#action_12703696 ]


Jonathan Gray commented on HBASE-1249:
--------------------------------------

{quote}
+ Is the Family object needed? The old array of arrays where columns were 
compound of family and qualifier would seem to be more compact?
{quote}
Yes, it is more compact.  It is also insanely difficult to actually use and 
reason about.  We have had a good amount of back and forth about whether to use 
more classes or not.  This applies to HBASE-880 especially.  My general 
argument (client-side) is:
- The HTable API needs to be smaller; it is very wide, and that makes things 
complicated.
- The HTable API does not reflect server-side implementation.
- Dealing with family:column and binary is a nightmare.  Delimiters generally 
suck.
- We can retain a slimmer, direct byte[] based HTable API for ease-of-use, and 
then use classes for more complex/custom queries.
- A hierarchical client API that maps to a server-side set of classes mirrors 
queries to implementation and makes it clearer to the user which things have a 
cost (adding a family as a parameter is very different from adding another 
column parameter to an existing family).
- Instantiating objects on the client side for Gets should basically be 
considered "free".  Compared to the number of comparators, utility functions, 
and allocations we currently need in order to deal with family:column byte[]s, 
there's no difference in client-side performance from using nested objects and 
lists.  Performance does/will come from getting rid of server-side 
instantiation/allocation (zero-copy reads, optimized Gets, not using TreeMap 
after TreeMap).
- Methods of all sorts basically become a single serialized object, many times 
just lists of KeyValues, for RPC, in the interest of moving towards a 
language-agnostic client protocol.
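
To make the nested-object argument concrete, here is a minimal sketch of what a 
hierarchical Get could look like.  The Get and Family classes and their 
addFamily/addColumn methods are hypothetical names made up for illustration; 
they are not the API from the attached design docs or from any HBase release.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class HierarchicalGetSketch {

    /** Hypothetical: one column family plus the qualifiers requested from it. */
    static final class Family {
        final byte[] name;
        final List<byte[]> qualifiers = new ArrayList<>();

        Family(byte[] name) { this.name = name; }

        Family addColumn(byte[] qualifier) {
            qualifiers.add(qualifier);
            return this;
        }
    }

    /** Hypothetical: a row key plus the families to read from. */
    static final class Get {
        final byte[] row;
        final List<Family> families = new ArrayList<>();

        Get(byte[] row) { this.row = row; }

        Get addFamily(Family family) {
            families.add(family);
            return this;
        }

        int familyCount() { return families.size(); }

        int columnCount() {
            int n = 0;
            for (Family f : families) n += f.qualifiers.size();
            return n;
        }
    }

    static byte[] bytes(String s) { return s.getBytes(StandardCharsets.UTF_8); }

    /** Two families, three columns in total; no family:qualifier delimiter needed. */
    static Get example() {
        return new Get(bytes("row1"))
            .addFamily(new Family(bytes("info"))
                .addColumn(bytes("name"))
                .addColumn(bytes("email")))
            .addFamily(new Family(bytes("stats"))
                .addColumn(bytes("visits")));
    }
}
```

The shape of the call makes the family/column distinction visible: adding a 
second family is a separate addFamily step, while extra columns hang off a 
family that is already part of the query.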

Should bring this conversation to IRC if people want to chime in... we 
definitely want others' input.  This is a big decision, but the API is much 
more usable like this and the server implementations are much easier to reason 
about.

{quote}
+ How do I read GetXServer? Is that the client-side GetX that has been 
deserialized Server-side? The GetXServer.compareTo will take into consideration 
the TimeRange? I think I like it. Why do I need result 2 and 3 out of the 
compareTo? Whats wrong with the compareTo working like any other comparator 
returning < 0, 0, or > 0? If 0, we add to the result. If > 0, we've gone past 
whatever our context, storefile or store, and then in the loop we just move on 
to the next storefile or store. Shouldn't be compareTo if returning different 
kind of results.
{quote}
No problem with renaming it.  It does break the normal convention.  It can't be 
-1, 0, 1 because it has four different things it needs to say.  You say that if 
> 0, we've gone past the storefile or store and move on to the next one.  That 
means every query will always have to check every storefile.  Many of our 
queries (GetColumns and GetTop at least) have early-outs where they do not need 
to look into any other storefiles.  So this compare needs to be able to say: 
I'm done, return to client now.
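
A rough sketch of the idea follows.  The four code names, the ColumnMatcher and 
the Reader are assumptions made up for illustration (the comment above only 
says there are four outcomes, one of which is "done"), and storefiles are 
reduced to sorted arrays of column names, newest file first.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

final class MatchCodeSketch {

    /** Hypothetical codes; the names are illustrative, not from the design docs. */
    enum MatchCode { INCLUDE, SKIP, NEXT_STOREFILE, DONE }

    /** Hypothetical matcher: wants the newest value of each requested column. */
    static final class ColumnMatcher {
        private final TreeSet<String> remaining;

        ColumnMatcher(Collection<String> wanted) { remaining = new TreeSet<>(wanted); }

        MatchCode match(String column) {
            if (remaining.isEmpty()) return MatchCode.DONE;   // every column found
            if (column.compareTo(remaining.last()) > 0) {
                return MatchCode.NEXT_STOREFILE;              // past the last wanted column
            }
            return remaining.remove(column) ? MatchCode.INCLUDE : MatchCode.SKIP;
        }
    }

    /** Walks storefiles newest first and counts how many were opened. */
    static final class Reader {
        int filesVisited = 0;

        List<String> get(List<String[]> storeFilesNewestFirst, ColumnMatcher matcher) {
            List<String> result = new ArrayList<>();
            for (String[] file : storeFilesNewestFirst) {
                filesVisited++;
                fileLoop:
                for (String column : file) {
                    switch (matcher.match(column)) {
                        case INCLUDE: result.add(column); break;
                        case SKIP: break;
                        case NEXT_STOREFILE: break fileLoop;  // give up on this file only
                        case DONE: return result;             // early-out: skip older files
                    }
                }
            }
            return result;
        }
    }
}
```

With storefiles {a, b, d, e}, {b, c, d} and {d} and a request for columns b and 
d, the matcher answers DONE while still inside the first file, so the two older 
files are never opened.  A plain < 0, 0, > 0 comparator has no way to express 
that.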

{quote}
+ What about deletes? They are orthogonal to this compareTo test? They are a 
running list that we bring along with our results as we do currently? Looks 
like you have this thing called NewDeletes that GetX knows about?
+ How does your DeleteSet work? How will it delete with different types (e.g. 
what do you add to this Set? Deletes? If so, how you going to have the Put 
something is supposed to Delete match in the comparator? Currently I have a 
special comparator that ignores types... that won't be good enough if need to 
consider family, column and plain deletes).
{quote}
Best answered by Erik... My understanding is that we basically keep deletes to 
the side.  DeleteFamily entries are special-cased.  Since deletes only apply to 
older storefiles, when reading a storefile we insert its deletes into a 
newDeletes list and use the oldDeletes list to check whether things are 
deleted.

But the way things are processed is neat.  It's a sorted merge down the 
oldDeletes list, so you do not have to check against a bunch of things or do 
an O(log n) TreeMap operation.
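
As a minimal sketch of that merge, under heavy simplifying assumptions: entries 
are just a column name plus a put/delete flag, with no timestamps, delete types 
or DeleteFamily special case, and the Entry, readFile and mergeSorted names are 
invented here rather than taken from Erik's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DeleteMergeSketch {

    /** Hypothetical storefile entry: a column name that is either a put or a delete. */
    static final class Entry {
        final String column;
        final boolean isDelete;
        private Entry(String column, boolean isDelete) {
            this.column = column;
            this.isDelete = isDelete;
        }
        static Entry put(String column) { return new Entry(column, false); }
        static Entry delete(String column) { return new Entry(column, true); }
    }

    /** Reads one sorted storefile, advancing a single cursor down oldDeletes. */
    static List<String> readFile(List<Entry> file, List<String> oldDeletes,
                                 List<String> newDeletes) {
        List<String> live = new ArrayList<>();
        int d = 0;                                   // cursor into oldDeletes
        for (Entry e : file) {
            if (e.isDelete) {                        // only affects older files
                newDeletes.add(e.column);
                continue;
            }
            while (d < oldDeletes.size() && oldDeletes.get(d).compareTo(e.column) < 0) d++;
            boolean deleted = d < oldDeletes.size() && oldDeletes.get(d).equals(e.column);
            if (!deleted) live.add(e.column);
        }
        return live;
    }

    /** Merges two sorted lists into one sorted list, dropping cross-list duplicates. */
    static List<String> mergeSorted(List<String> a, List<String> b) {
        List<String> out = new ArrayList<>(a.size() + b.size());
        int i = 0, j = 0;
        while (i < a.size() && j < b.size()) {
            int c = a.get(i).compareTo(b.get(j));
            if (c < 0) out.add(a.get(i++));
            else if (c > 0) out.add(b.get(j++));
            else { out.add(a.get(i++)); j++; }
        }
        while (i < a.size()) out.add(a.get(i++));
        while (j < b.size()) out.add(b.get(j++));
        return out;
    }

    /** Processes files newest first; returns the surviving puts of each file. */
    static List<List<String>> readAll(List<List<Entry>> filesNewestFirst) {
        List<String> oldDeletes = new ArrayList<>();
        List<List<String>> perFile = new ArrayList<>();
        for (List<Entry> file : filesNewestFirst) {
            List<String> newDeletes = new ArrayList<>();
            perFile.add(readFile(file, oldDeletes, newDeletes));
            oldDeletes = mergeSorted(oldDeletes, newDeletes);
        }
        return perFile;
    }

    static List<List<String>> demo() {
        List<Entry> newest = Arrays.asList(Entry.put("a"), Entry.delete("b"), Entry.put("c"));
        List<Entry> older = Arrays.asList(Entry.put("a"), Entry.put("b"), Entry.put("d"));
        return readAll(Arrays.asList(newest, older));
    }
}
```

Because both the storefile and oldDeletes are sorted the same way, the cursor 
only ever moves forward, so checking a whole file costs one linear pass instead 
of a tree lookup per entry.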

{quote}
+ We're changing how filters work?
{quote}
I would like to.  None of the code is done.  I have no issue with delaying that 
for 0.21, but since things are being ripped apart I thought we might get it in.  
The biggest change outlined here is adding new language similar to the 
compareTo above: done, return now.  That would allow for efficient limit/offset 
queries and other such things.  As you say, in the current code it is the same.  
The other thing with filters is that it would be nice to be able to use dynamic 
classes.  So we might just put off filter changes to 0.21 and give them more 
attention.
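
Since none of the filter code exists yet, the following is only a sketch of how 
a "done, return now" answer might enable a limit/offset query.  FilterCode, 
LimitOffsetFilter and Scan are hypothetical names, and rows are reduced to 
plain strings.

```java
import java.util.ArrayList;
import java.util.List;

final class LimitOffsetFilterSketch {

    /** Hypothetical filter answers; DONE is the proposed "return now" signal. */
    enum FilterCode { INCLUDE, SKIP, DONE }

    /** Hypothetical filter: skip the first `offset` rows, then keep `limit` rows. */
    static final class LimitOffsetFilter {
        private final int offset;
        private final int limit;
        private int seen = 0;

        LimitOffsetFilter(int offset, int limit) {
            this.offset = offset;
            this.limit = limit;
        }

        // A real filter would also inspect the row; this one only counts rows.
        FilterCode filter(String row) {
            if (seen >= offset + limit) return FilterCode.DONE;
            return seen++ < offset ? FilterCode.SKIP : FilterCode.INCLUDE;
        }
    }

    /** Hypothetical scan loop that stops as soon as the filter says DONE. */
    static final class Scan {
        int examined = 0;

        List<String> run(List<String> rows, LimitOffsetFilter filter) {
            List<String> out = new ArrayList<>();
            for (String row : rows) {
                examined++;
                FilterCode code = filter.filter(row);
                if (code == FilterCode.DONE) break;   // stop scanning entirely
                if (code == FilterCode.INCLUDE) out.add(row);
            }
            return out;
        }
    }
}
```

With ten rows, offset 2 and limit 3, the scan examines six rows (the sixth one 
triggers DONE) rather than all ten.  Without a DONE answer the filter could 
only keep rejecting rows until the scan ran out of data.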

> Rearchitecting of server, client, API, key format, etc for 0.20
> ---------------------------------------------------------------
>
>                 Key: HBASE-1249
>                 URL: https://issues.apache.org/jira/browse/HBASE-1249
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: Jonathan Gray
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: HBASE-1249-Example-v1.pdf, HBASE-1249-Example-v2.pdf, 
> HBASE-1249-GetQuery-v1.pdf, HBASE-1249-GetQuery-v2.pdf, 
> HBASE-1249-GetQuery-v3.pdf, HBASE-1249-GetQuery-v4.pdf, 
> HBASE-1249-StoreFile-v1.pdf, HBASE-1249-StoreFile-v4.pdf
>
>
> To discuss all the new and potential issues coming out of the change in key 
> format (HBASE-1234): zero-copy reads, client binary protocol, update of API 
> (HBASE-880), server optimizations, etc...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
