Have you considered using AggregationProtocol to perform the aggregation?

Thanks
On Jan 20, 2012, at 11:08 PM, Praveen Sripati <[email protected]> wrote:

> Hi,
>
> 1) According to this URL (1), HBase performs well with two or three
> column families. Why is that?
>
> 2) A dump of an HFile looks like the one below. The contents of a row stay
> together, as in a regular row-oriented database. If the column family has
> 100 column qualifiers and is dense, then the data for a particular column
> qualifier is spread wide. If I want to do an aggregation on a particular
> column qualifier, the disk seeks don't seem to be much better than in a
> regular row-oriented database.
>
> Please correct me if I am wrong.
>
> K: row-550/colfam1:50/1309813948188/Put/vlen=2 V: 50
> K: row-550/colfam1:50/1309812287166/Put/vlen=2 V: 50
> K: row-551/colfam1:51/1309813948222/Put/vlen=2 V: 51
> K: row-551/colfam1:51/1309812287200/Put/vlen=2 V: 51
> K: row-552/colfam1:52/1309813948256/Put/vlen=2 V: 52
>
> (1) - http://hbase.apache.org/book/number.of.cfs.html
>
> Thanks,
> Praveen
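
To make the layout in the question concrete, here is a minimal sketch of how cells within one column family are ordered inside a store file: by row, then family, then qualifier, with newer timestamps first. It is only an illustration under stated assumptions. `Cell` and `hfile_sort_key` are hypothetical names, not HBase API, and the sample rows and qualifiers (`q1` to `q3`) are made up rather than taken from the dump above.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    """Illustrative stand-in for one key/value entry (not an HBase class)."""
    row: str
    family: str
    qualifier: str
    timestamp: int
    value: str


def hfile_sort_key(cell: Cell):
    # Row, family and qualifier compare as bytes in ascending order;
    # the timestamp is negated so that newer versions sort first.
    return (
        cell.row.encode(),
        cell.family.encode(),
        cell.qualifier.encode(),
        -cell.timestamp,
    )


# Hypothetical dense data: 3 rows x 3 qualifiers in a single column family.
cells = [
    Cell(f"row-{r}", "colfam1", f"q{q}", 1309813948188, str(q))
    for r in (550, 551, 552)
    for q in (1, 2, 3)
]

ordered = sorted(cells, key=hfile_sort_key)

# Positions of qualifier q2 in the sorted file: one entry per row,
# separated by the other qualifiers of that row.
positions = [i for i, c in enumerate(ordered) if c.qualifier == "q2"]

for i, c in enumerate(ordered):
    print(i, f"{c.row}/{c.family}:{c.qualifier}/{c.timestamp}", c.value)
print("q2 positions:", positions)
```

In this sketch the cells for `q2` are never adjacent: they are separated by the other qualifiers of each row, which is the "spread wide" effect the question describes for a dense family with many qualifiers.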
