Programming practices for implementing composite row keys

2013-09-05 Thread praveenesh kumar

Hello people,

I have a scenario that requires composite row keys for my HBase
table.

Basically the key would be entity1,entity2,entity3.

Searches would be by entity1 first, and then by entity2 and entity3. I know I
can do a start/stop row scan on entity1 and then put row filters on entity2
and entity3.

My question is: what are the best programming practices for implementing these
keys?

1. Just use simple delimiters: entity1:entity2:entity3.

2. Create complex data types, like Java structures. I don't know if anyone
uses structures as keys; if they do, can someone please point out the
scenarios for which they would be a good fit? Are they a good fit for this
scenario?

3. What are the pros and cons of 1 and 2 when it comes to data
retrieval?

4. My entity1 can also be negative. Does that make any difference
where HBase ordering is concerned? How can I tackle this?

Any help on how to implement composite row keys would be much appreciated. I
want to understand how the community deals with implementing composite row
keys.

Regards
Praveenesh

Re: Programming practices for implementing composite row keys

2013-09-05 Thread Ted Yu

For #2 and #4, see HBASE-8693 'DataType: provide extensible type API', which
has been integrated into 0.96.

Cheers

On Thu, Sep 5, 2013 at 7:14 AM, Shahab Yunus shahab.yu...@gmail.com wrote:

My 2 cents:

1- Yes, that is one way to do it. You can also use a fixed length for every
attribute participating in the composite key. HBase scans would fit this
pattern better as well, I believe (?). It's basically a trade-off
between space (all that padding increases the key size) and the
complexity of choosing and handling a delimiter and the consequent
parsing of keys, etc.

2- I personally have not heard of this. As far as I understand, it
goes against the whole idea of HBase scanning, and prefix and fuzzy filters
would not be possible this way. I would not recommend it.

3- See the replies to 1 and 2.

4- By default, keys are sorted with a binary comparator. Negative numbers are
a bit tricky as far as I know, at least the last time I checked. Some tips here:

http://stackoverflow.com/questions/17248510/hbase-filters-not-working-for-negative-integers

Can you normalize them (or take the absolute value) before reading and writing
(at some cost in performance, of course)? That is only possible if keys with
the same magnitude but different signs cannot exist as distinct entities.
This depends on your business logic and the type/nature of your data.
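
A minimal sketch of how points 1 and 4 can be combined, assuming all three entities are 32-bit ints: each entity is written as a fixed-width, big-endian 4-byte field with its sign bit flipped, so that a plain unsigned byte-wise comparison (the order a binary comparator imposes) agrees with signed numeric order. The class and method names below are made up for illustration and are not part of any HBase API.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class CompositeKeySketch {

    // Flipping the sign bit maps Integer.MIN_VALUE..Integer.MAX_VALUE onto
    // 0x00000000..0xFFFFFFFF, so unsigned byte order matches signed int order.
    public static int flipSign(int v) {
        return v ^ Integer.MIN_VALUE;
    }

    // Fixed-width key: three 4-byte big-endian fields, no delimiter needed.
    public static byte[] encode(int entity1, int entity2, int entity3) {
        return ByteBuffer.allocate(12)
                .putInt(flipSign(entity1))
                .putInt(flipSign(entity2))
                .putInt(flipSign(entity3))
                .array();
    }

    // Reverse of encode(): read the three fields and undo the sign flip.
    public static int[] decode(byte[] key) {
        ByteBuffer buf = ByteBuffer.wrap(key);
        return new int[] {
            flipSign(buf.getInt()), flipSign(buf.getInt()), flipSign(buf.getInt())
        };
    }

    // Plain two's-complement big-endian bytes, shown only for contrast.
    public static byte[] rawBigEndian(int v) {
        return ByteBuffer.allocate(4).putInt(v).array();
    }

    // Unsigned lexicographic comparison of two byte arrays.
    public static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        // Without the flip, -5 sorts after 3 because its first byte is 0xFF.
        System.out.println(compareUnsigned(rawBigEndian(-5), rawBigEndian(3)) > 0);
        // With the flip, the key for entity1 = -5 sorts before entity1 = 3.
        System.out.println(compareUnsigned(encode(-5, 1, 1), encode(3, 1, 1)) < 0);
        System.out.println(Arrays.toString(decode(encode(-5, 1, 1))));
    }
}
```

The cost is the one mentioned in point 1: every key is always 12 bytes, whatever the values. Variable-length entities such as strings would need padding or a different encoding, which this sketch does not cover.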

Regards,
Shahab


Re: Programming practices for implementing composite row keys

2013-09-05 Thread Shahab Yunus

Ah! I didn't know about HBASE-8693. Good information. Thanks Ted.

Regards,
Shahab


Re: Programming practices for implementing composite row keys

2013-09-05 Thread Doug Meil


Greetings, 

For more food for thought, some case studies on composite rowkey design are
in the refguide:

http://hbase.apache.org/book.html#schema.casestudies

Re: Programming practices for implementing composite row keys

2013-09-05 Thread Anoop John

Hi,
Have a look at Phoenix [1]. There you can define a composite row key
model, and it handles the ordering of negative numbers. The scan model you
mentioned is also well supported, with start/stop row keys on entity1 and
SkipScanFilter for the others.

-Anoop-

[1] https://github.com/forcedotcom/phoenix
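
Independently of Phoenix, the start/stop row part of this can be sketched in plain Java. The example assumes entity1 is the leading fixed-width field of the key, written as a 4-byte big-endian int with the sign bit flipped, and that an empty stop row means "scan to the end of the table". The start row is the encoded entity1 prefix, and the exclusive stop row is that prefix incremented by one. The class and helper names are hypothetical, not an HBase or Phoenix API.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class PrefixScanBounds {

    // Encode entity1 as the 4-byte key prefix (big-endian, sign bit flipped).
    public static byte[] prefixFor(int entity1) {
        return ByteBuffer.allocate(4).putInt(entity1 ^ Integer.MIN_VALUE).array();
    }

    // Exclusive stop row for "all keys starting with prefix": drop trailing
    // 0xFF bytes, then add one to the last remaining byte.
    public static byte[] stopRowForPrefix(byte[] prefix) {
        int end = prefix.length;
        while (end > 0 && prefix[end - 1] == (byte) 0xFF) {
            end--;
        }
        if (end == 0) {
            return new byte[0]; // prefix was all 0xFF: no upper bound
        }
        byte[] stop = Arrays.copyOf(prefix, end);
        stop[end - 1]++;
        return stop;
    }

    // Unsigned lexicographic comparison of two byte arrays.
    public static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) {
                return d;
            }
        }
        return a.length - b.length;
    }

    // True if key falls in [start, stop); an empty stop means unbounded.
    public static boolean inRange(byte[] key, byte[] start, byte[] stop) {
        return compareUnsigned(key, start) >= 0
                && (stop.length == 0 || compareUnsigned(key, stop) < 0);
    }

    public static void main(String[] args) {
        byte[] start = prefixFor(42);
        byte[] stop = stopRowForPrefix(start);
        // A full 12-byte key whose first field is 42 falls inside the range.
        byte[] key = ByteBuffer.allocate(12).put(start).putInt(7).putInt(9).array();
        System.out.println(inRange(key, start, stop));
        // A key whose first field is 43 does not.
        System.out.println(inRange(prefixFor(43), start, stop));
    }
}
```

The two arrays would then be handed to the scan as its start and stop rows. Restricting entity2 and entity3 within that range is still left to filters, as described above.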

