Hi,

I would like to set up an HBase table that lets users perform selects only (gets and scans). We don't need users to perform inserts or updates at the moment, although I will of course have to load the data into the table before users can query it.

I could use a composite row key of the form "brand:date:users", where brand is a 4-letter code for each brand, date is DD-MM-YYYY, and users is the metric (how many people bought a certain brand). This would give me a rather tall table with millions of rows and few columns (maybe 2 at most).
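To make the tall layout concrete, here is a minimal sketch in plain Python. It does not talk to HBase; it only mimics the fact that HBase keeps row keys in lexicographic byte order, so a prefix scan amounts to a range lookup. The helper names `make_row_key` and `prefix_scan` and the brand codes are made up for illustration.

```python
from bisect import bisect_left

def make_row_key(brand: str, date: str, metric: str) -> bytes:
    """Build a 'brand:date:metric' row key (hypothetical helper)."""
    return f"{brand}:{date}:{metric}".encode("utf-8")

def prefix_scan(sorted_keys, prefix: bytes):
    """Return every key that starts with `prefix` from a sorted key list."""
    start = bisect_left(sorted_keys, prefix)  # first key >= prefix
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so no later key can match
        matches.append(key)
    return matches

# Hypothetical brand codes and dates, sorted the way HBase orders row keys.
keys = sorted(
    make_row_key(brand, date, "users")
    for brand in ("ADID", "NIKE", "PUMA")
    for date in ("01-02-2012", "02-01-2012", "15-01-2012")
)

nike_rows = prefix_scan(keys, b"NIKE:")
```

With this key, all rows for one brand sit next to each other, so selecting a single brand is one contiguous scan. Note that DD-MM-YYYY sorts by day first, so rows within a brand are not in chronological order; a date range scan would probably need a format such as YYYY-MM-DD.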

or

Would it be better to have a wider table with the row key as users:date only, and have the brands become a column family? There are many brands to track on a daily basis. People using my table will need to select a particular brand, a group of brands, or all brands to retrieve and display data.
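The wide alternative can be sketched the same way, again in plain Python with no HBase client. Each row is keyed by "metric:date" and holds one column per brand; whether the brand ends up as a column family or as a column qualifier is left open here. The helper names and the sample numbers are invented purely for illustration.

```python
def make_wide_row_key(metric: str, date: str) -> str:
    """Build a 'metric:date' row key (hypothetical helper)."""
    return f"{metric}:{date}"

# row key -> {brand column -> value}; the values are made-up sample numbers
table = {
    make_wide_row_key("users", "01-02-2012"): {"ADID": 120, "NIKE": 340, "PUMA": 75},
    make_wide_row_key("users", "02-02-2012"): {"ADID": 98, "NIKE": 410, "PUMA": 60},
}

def get_brands(table, row_key, brands=None):
    """Fetch one brand, a group of brands, or all brands from a single row."""
    row = table.get(row_key, {})
    if brands is None:
        return dict(row)  # all brands in the row
    return {b: row[b] for b in brands if b in row}

one_brand = get_brands(table, "users:01-02-2012", ["NIKE"])
brand_group = get_brands(table, "users:01-02-2012", ["ADID", "PUMA"])
all_brands = get_brands(table, "users:01-02-2012")
```

In this layout a single get on one row returns every brand for that day, and picking a brand or a group becomes a column selection rather than a separate row lookup.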

If I recollect correctly, tall tables are recommended when one is not doing atomic operations. Is that right? Does a get/scan in HBase perform any row locking? Having a tall table means more data can be spread out over regions on different nodes in my cluster. I have a small test cluster of 3 nodes at the moment.

I intend to add other metrics (quantity, price, etc.) and types (brand, product, campaign, etc.), so my table will grow fast and hold lots of data.

If I use the type (brand, campaign, product) as part of the row key, then my inserts will run into the millions over time, but if I make the type a column family, then I will end up with wider entries and fewer rows.
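The trade-off in row counts can be put into rough numbers. The sketch below assumes one value per brand, per day, per metric; the function name and the figures passed to it (500 brands, 365 days, 3 metrics) are hypothetical and only meant to show the arithmetic.

```python
def layout_sizes(num_brands: int, num_days: int, num_metrics: int) -> dict:
    """Compare row counts for the tall and wide layouts (illustrative only)."""
    # Tall: row key is brand:date:metric, so every combination is its own row.
    tall_rows = num_brands * num_days * num_metrics
    # Wide: row key is metric:date, and each brand is a column in that row.
    wide_rows = num_days * num_metrics
    return {
        "tall_rows": tall_rows,
        "wide_rows": wide_rows,
        "wide_columns_per_row": num_brands,
        "cells": wide_rows * num_brands,  # same number of values either way
    }

# Hypothetical figures: 500 brands tracked daily for a year, 3 metrics.
sizes = layout_sizes(num_brands=500, num_days=365, num_metrics=3)
```

Both layouts store the same number of values; the difference is only whether they are spread over many short rows or packed into fewer, wider ones.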

Thanks,
Usman