[ 
https://issues.apache.org/jira/browse/HBASE-10756?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13938376#comment-13938376
 ] 

stack commented on HBASE-10756:
-------------------------------

[~manukranthk] Sweet.

Here is the short version.  If you want more, just say; we can do a phone call and 
I'll catch you up.

In our little hbase ecosystem, there are as many type systems and type 
serializations as there are tools on top.  Kiji, Kite, Phoenix (and others such 
as Splice Machine) have all come up w/ their own way of serializing types into 
HBase and, beyond this, of serializing in a manner that preserves order when 
values are used as row key parts: e.g. flipping the sign bit so negative numbers 
sort before positive numbers, etc.  Kiji and Kite depend on Avro type 
serializations with customizations.  Phoenix has its own system.
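
To make the sign-bit trick concrete, here is a minimal sketch. It is not the 
encoding used by any of the projects named above; the class and method names 
are made up for illustration. It assumes a fixed-width 4-byte big-endian int 
and compares keys as unsigned bytes, which is how HBase orders row keys.

```java
final class OrderedIntSketch {
    // Hypothetical helper: encode a signed 32-bit int as 4 big-endian bytes
    // with the sign bit flipped, so that unsigned byte-wise comparison
    // agrees with signed numeric order.
    static byte[] encode(int value) {
        int flipped = value ^ 0x80000000;
        return new byte[] {
            (byte) (flipped >>> 24),
            (byte) (flipped >>> 16),
            (byte) (flipped >>> 8),
            (byte) flipped
        };
    }

    // Inverse of encode: reassemble the int and flip the sign bit back.
    static int decode(byte[] bytes) {
        int flipped = ((bytes[0] & 0xff) << 24)
                    | ((bytes[1] & 0xff) << 16)
                    | ((bytes[2] & 0xff) << 8)
                    | (bytes[3] & 0xff);
        return flipped ^ 0x80000000;
    }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] negative = encode(-5);
        byte[] positive = encode(3);
        // A negative result means the encoded -5 sorts before the encoded 3.
        System.out.println(compareUnsigned(negative, positive) < 0);
    }
}
```

Without the flip, the two's-complement bytes of -5 start with 0xFF and would 
sort after every non-negative value.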

Each has its own way of specifying the key format, usually as a serialized data 
structure specified variously (as 'special columns' in phoenix or via avro 
customizations in kite).

Chatting offline, the thought was that to get all these systems interacting, 
we need to agree on the first step, a serialization format (later we can come 
along and agree on how to spec rowkeys, schema evolution...).  So, can we agree 
on how to serialize an int, sql types, and complex types into a cell?

In an effort at a serialization esperanto, [~ndimiduk] built the OrderedTypes 
and the content of the types package in hbase, originally toward the end of last 
year, as a system that Hive might move to (this project is apparently on hold at 
the moment).  Could this effort be the common format we all use?

Phoenix has said already that it will move to Nick's system.  The Kite folks 
are looking at it to see if it could serve as the serialization basis for kite.  
Would it work for Presto [~manukranthk]?

It is an amalgam of the Orderly project, phoenix serializations, and SQLite.  
See here for more on launching the project: http://search-hadoop.com/m/JfPZzujFjZ  
and here for an overview: 
https://issues.apache.org/jira/secure/attachment/12589798/hbase%20data%20types%20WIP.pdf
(all from HBASE-8089). 

Good on you.

> Adding Data Types and Structured Row Keys in 0.89-fb HBase
> ----------------------------------------------------------
>
>                 Key: HBASE-10756
>                 URL: https://issues.apache.org/jira/browse/HBASE-10756
>             Project: HBase
>          Issue Type: New Feature
>          Components: Usability
>    Affects Versions: 0.89-fb
>            Reporter: Manukranth Kolloju
>            Assignee: Manukranth Kolloju
>             Fix For: 0.89-fb
>
>
> As an extension to some of the work done on the Presto + HBase side, and also 
> inspired by some of the work done in open source and Phoenix, introducing 
> data types and structured row keys will enable the database (hbase) to 
> de-couple database level optimizations from the application level schema. The 
> attempt is to provide a table definition & specification to define the row 
> key structure, which can be a composite struct composed of 
> primitive data types.
> The database can make intelligent decisions about how to interpret the data. 
> For instance, an understanding of the structure of the row key will 
> hint the database about the parts of the data that are valuable, and it can use 
> that information to construct indexes/bloom filters based on these parts of 
> the row key.
> This can be extended to the column qualifiers and Nested Types as well.
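
As a rough illustration of a row key built as a composite struct of primitive 
types, the sketch below concatenates two fields so that byte order follows 
field order. The field names, the 0x00 string terminator, and the class itself 
are assumptions for the example, not the format proposed in this issue; it 
also assumes the string field never contains a 0x00 byte.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class CompositeRowKeySketch {
    // Hypothetical two-field key: a UTF-8 string terminated by 0x00,
    // followed by a sign-flipped big-endian 32-bit int.
    static byte[] rowKey(String userId, int eventId) {
        byte[] name = userId.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream(name.length + 5);
        out.write(name, 0, name.length);
        out.write(0x00);                    // terminator: shorter strings sort first
        int flipped = eventId ^ 0x80000000; // flip sign bit to preserve numeric order
        out.write(flipped >>> 24);          // write(int) keeps only the low 8 bits
        out.write(flipped >>> 16);
        out.write(flipped >>> 8);
        out.write(flipped);
        return out.toByteArray();
    }

    // Unsigned lexicographic comparison of two byte arrays.
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] first = rowKey("alice", -1);
        byte[] second = rowKey("alice", 2);
        // Same first field, so the second field decides the order.
        System.out.println(compareUnsigned(first, second) < 0);
    }
}
```

Because the first field is a self-delimiting prefix of the key, a store that 
knows this layout could, for example, build a bloom filter over the userId 
part alone.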



--
This message was sent by Atlassian JIRA
(v6.2#6252)
