Hi all, suppose that I have this RDBMS table (entity-attribute-value model
<http://en.wikipedia.org/wiki/Entity-attribute-value_model>):
col1: entityID
col2: attributeName
col3: value
and I want to use HBase due to scaling issues.
I know that the only way to access an HBase table is through a single primary
"row key" (cursor). You can get a cursor for a specific "row key" and
iterate over the rows one by one.
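
To make sure I have the access model right, here is how I picture it, as a small Python sketch. It is only an in-memory stand-in and not the HBase client API; `SortedTable`, `put`, `get`, and `scan` are names I made up for illustration.

```python
import bisect


class SortedTable:
    """Toy stand-in for a table whose rows are kept sorted by row key."""

    def __init__(self):
        self._keys = []   # row keys, kept in sorted order
        self._rows = {}   # row key -> {column: value}

    def put(self, row_key, column, value):
        # Insert the key in sorted position the first time the row is seen.
        if row_key not in self._rows:
            bisect.insort(self._keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def get(self, row_key):
        # Point lookup by row key; returns None when the row is absent.
        return self._rows.get(row_key)

    def scan(self, start_key):
        # Iterate rows one by one, in key order, starting at start_key.
        i = bisect.bisect_left(self._keys, start_key)
        for key in self._keys[i:]:
            yield key, self._rows[key]


table = SortedTable()
table.put("e2", "color", "red")
table.put("e1", "color", "blue")
table.put("e3", "size", "L")

# Keys visited by a scan that starts at "e2".
scanned_keys = [key for key, _ in table.scan("e2")]
```

The point is that both the lookup and the scan are driven by the row key alone; nothing here lets me search by a column value.
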
The issue is that, in my case, I want to be able to look up rows by all 3
columns (in an RDBMS I would put an index on all three columns, so I could
query on any of them). For example:
- for a given entityID I want to get all its attributes and values
- for a given attributeName and value I want to get all the entityIDs ...
So one idea I had is to build one HBase table that holds the data (table
DATA, with entityID as the primary key), and 2 "index" tables, one with
attributeName as the primary key and the other with value as the primary key.
Each index table would hold a list of pointers (entityIDs) into the DATA
table.
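
Roughly what I have in mind, as a Python sketch. Plain dicts stand in for the three HBase tables, and the names `data`, `by_attribute`, `by_value`, `put`, `attributes_of`, and `entities_with` are all hypothetical. It assumes each (entityID, attributeName) pair is written once, so it does not clean up the index tables on overwrite or delete.

```python
from collections import defaultdict

# In-memory stand-ins for the three tables (not real HBase tables).
data = defaultdict(dict)          # DATA:    entityID      -> {attributeName: value}
by_attribute = defaultdict(set)   # index 1: attributeName -> {entityID, ...}
by_value = defaultdict(set)       # index 2: value         -> {entityID, ...}


def put(entity_id, attribute, value):
    """Write one triple to DATA and add a pointer to each index table."""
    # Assumption: no overwrites or deletes, so stale pointers never arise.
    data[entity_id][attribute] = value
    by_attribute[attribute].add(entity_id)
    by_value[value].add(entity_id)


def attributes_of(entity_id):
    """Query 1: all attributes and values for a given entityID."""
    return dict(data.get(entity_id, {}))


def entities_with(attribute, value):
    """Query 2: all entityIDs that have the given attributeName and value."""
    # Intersect the two index lookups to get candidate entityIDs.
    candidates = by_attribute.get(attribute, set()) & by_value.get(value, set())
    # Re-check against DATA: an entity can have the attribute on one row
    # and the value on a different row.
    return sorted(e for e in candidates if data[e].get(attribute) == value)


put("e1", "color", "red")
put("e2", "color", "blue")
put("e2", "mood", "red")
put("e3", "color", "red")
```

Note that with two separate index tables the second query needs both an intersection and a re-check against DATA: in the example, "e2" appears under "color" and under "red" but does not have color = red.
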
Is it a reasonable approach? Or is it an 'abuse' of HBase concepts?
10x
Yonatan