This has been discussed recently on the mailing list; see these two threads, for example:
http://search-hadoop.com/m/amq9c1OaV9z1/wide+tall+hbase+table&subj=Insert+into+tall+table+50+faster+than+wide+table
and
http://search-hadoop.com/m/zbKmE14o0Js/wide+tall+hbase+table&subj=Re+Parent+child+relation+go+vertical+horizontal+or+many+tables+

This should help you get started. Feel free to write back if you have any
follow-up questions.

J-D

On Fri, Feb 18, 2011 at 12:16 AM, Usman Waheed <[email protected]> wrote:
> Hi,
>
> I would like to set up an HBase table that would give users the ability to
> perform selects only (gets and scans). We don't have a need for users to
> perform inserts or updates at the moment. But yes, I will have to load/insert
> the data into the tables before users can perform selects.
>
> I can have the row key as a composite, "brand:date:users", where brand
> is a 4-letter code for all brands, date is DD-MM-YYYY, and users is the
> metric (how many people bought a certain brand). This will give me a rather
> tall table with millions of rows and few columns (maybe 2 at most).
>
> or
>
> Would it be better to have a wider table with the row key as users:date only
> and have the brands become a column family? There are many brands to track
> on a daily basis. People using my table will need to select a particular
> brand, a group of brands, or all brands to retrieve and display data.
>
> If I recollect correctly, is it recommended to have tall tables if one is
> not doing atomic operations? Does a get/scan in HBase perform any row
> locking? Having a tall table means more data can be spread out over regions
> on different nodes in my cluster. I have a small test cluster of 3 nodes at
> the moment.
>
> I intend to have other metrics (quantity, price, etc.) and types (brand,
> products, campaigns, etc.), so my table will grow fast and hold lots of
> data.
>
> If I use the type (brand, campaign, product) as part of the row key, then my
> inserts will be in the millions over time, but if I make the type a column
> family, then I will end up with wider entries and fewer rows.
>
> Thanks,
> Usman
>
> --
> Using Opera's revolutionary email client: http://www.opera.com/mail/
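
For what it's worth, here is a minimal sketch of why the tall layout with a
composite "brand:date:users" key makes "select one brand" cheap. It is plain
Python, not the HBase client API: a sorted list stands in for the table, since
HBase keeps rows sorted by key. The brand codes, dates, and the helper names
row_key and prefix_scan are made up for illustration.

```python
from bisect import bisect_left


def row_key(brand: str, date: str, metric: str) -> str:
    """Build a composite key such as 'NIKE:18-02-2011:users' (hypothetical helper)."""
    return f"{brand}:{date}:{metric}"


def prefix_scan(sorted_keys, prefix):
    """Return all keys starting with prefix, similar in spirit to a
    start-row/stop-row scan over a sorted key space (hypothetical helper)."""
    start = bisect_left(sorted_keys, prefix)  # jump to the first candidate key
    matches = []
    for key in sorted_keys[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so nothing further can match
        matches.append(key)
    return matches


# Assumed sample data: three 4-letter brand codes over two days.
keys = sorted(
    row_key(brand, date, "users")
    for brand in ("ADID", "NIKE", "PUMA")
    for date in ("17-02-2011", "18-02-2011")
)

nike_rows = prefix_scan(keys, "NIKE:")
print(nike_rows)
```

With this layout, a group of brands would be one such range per brand, and all
brands would be a full scan. One thing to keep in mind: keys compare as plain
strings, so DD-MM-YYYY orders rows within a brand by day of month first rather
than chronologically, whereas YYYY-MM-DD would sort by date.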
