Hi,

I have a many-to-many relationship that I am trying to model in HBase, and I
want to be sure I am not missing anything, so please let me know or point me to
the right documentation.

Let's say I have a many-to-many relationship between A and B. The query takes
A's unique id as its parameter and returns the unique ids of all the B's
related to A, along with their properties and values.

The first solution I found uses two tables. In the first, the rowkey is A's
unique id and the column qualifiers are the unique ids of the B's related to A.
In the second, the rowkeys are the B unique ids and the columns hold the
property values. The query therefore takes two steps: a get on the A table to
collect all the B unique ids, then a second get on the B table, passing an
array of B rowkeys. When I run the second get, the first call has a much higher
latency, while subsequent calls with the same parameters have good, low
latency. I believe that's a caching effect...
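
To make the two-step lookup concrete, here is a minimal in-memory sketch in
Python. It is only a model of the layout: plain dicts stand in for the two
tables, nothing here is the HBase client API, and the names (a_to_b, b_props,
lookup_two_step) and the sample data are made up for illustration.

```python
# In-memory stand-ins for the two tables (hypothetical data, not the HBase API).

# Table 1: rowkey = A's unique id, column qualifiers = ids of the related B's.
# The cell values are unused here, so they are left empty.
a_to_b = {
    "a1": {"b1": b"", "b3": b""},
    "a2": {"b2": b"", "b3": b""},
}

# Table 2: rowkey = B's unique id, columns = B's properties.
b_props = {
    "b1": {"name": "first", "size": "10"},
    "b2": {"name": "second", "size": "20"},
    "b3": {"name": "third", "size": "30"},
}


def lookup_two_step(a_id):
    """Return {b_id: properties} for every B related to the given A."""
    # Step 1: a single get on A's row; its qualifiers are the B ids.
    b_ids = sorted(a_to_b.get(a_id, {}))
    # Step 2: one batched get on the B table for all collected rowkeys.
    return {b_id: b_props[b_id] for b_id in b_ids if b_id in b_props}
```

With this layout each B is stored once, but every query pays for two round
trips, and the B rows touched by the second step can live anywhere in the
key space.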

The second solution uses a single table with a composite rowkey equal to A
unique id + B unique id, so B's data is duplicated in one row for each A it is
related to. But when I scan on just the first part of the rowkey (the A unique
id), the latency is lower and more consistent.
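
As a rough illustration of the composite-rowkey layout, the sketch below
models a table as a list of byte-sorted keys and answers the query with a
single range scan over the A prefix. It is an in-memory model in Python, not
the HBase client API. The separator byte, the SortedTable class and the helper
names are assumptions made for the example; in particular it assumes ids never
contain the separator byte.

```python
from bisect import bisect_left, insort

SEP = b"\x00"  # hypothetical separator; assumes ids never contain this byte


def make_key(a_id, b_id):
    """Composite rowkey: A's id, the separator, then B's id."""
    return a_id.encode() + SEP + b_id.encode()


def prefix_stop(prefix):
    """Exclusive stop key for a prefix scan: the prefix with its last byte + 1.

    Assumes the last byte is not 0xff (true here, the prefix ends with SEP).
    """
    return prefix[:-1] + bytes([prefix[-1] + 1])


class SortedTable:
    """Toy table that keeps its rowkeys in byte order."""

    def __init__(self):
        self._keys = []
        self._rows = {}

    def put(self, key, row):
        if key not in self._rows:
            insort(self._keys, key)  # keep the key list sorted
        self._rows[key] = row

    def scan(self, start, stop):
        """Return rows with start <= key < stop, in key order."""
        lo = bisect_left(self._keys, start)
        hi = bisect_left(self._keys, stop)
        return [(k, self._rows[k]) for k in self._keys[lo:hi]]


def scan_by_a(table, a_id):
    """All (rowkey, properties) pairs whose rowkey starts with A's id + SEP."""
    prefix = a_id.encode() + SEP
    return table.scan(prefix, prefix_stop(prefix))


# Sample data: b3 is related to both a1 and a2, so its properties are stored twice.
table = SortedTable()
table.put(make_key("a1", "b1"), {"name": "first"})
table.put(make_key("a1", "b3"), {"name": "third"})
table.put(make_key("a10", "b9"), {"name": "ninth"})
table.put(make_key("a2", "b3"), {"name": "third"})

rows_for_a1 = scan_by_a(table, "a1")
```

Including the separator in the prefix keeps a scan for "a1" from also
returning the rows of "a10", and all rows for one A sit next to each other in
key order, which is the property the scan relies on.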

So, my questions are threefold: 1) which approach is best, 2) what is the
performance difference between a scan and a get with multiple rowkeys (I think
the scan is faster because the data is less "distributed", or not distributed
at all), and 3) how can we make the latency of the get with multiple rowkeys
more consistent?

Thank you for your help,
Marc
