Re: Disk Seeks and Column families

Praveen Sripati Sat, 21 Jan 2012 09:49:44 -0800

Thanks for the response.

> The contents of a row stay together like a regular row-oriented database.


> K: row-550/colfam1:50/1309813948188/Put/vlen=2 V: 50
> K: row-550/colfam1:50/1309812287166/Put/vlen=2 V: 50
> K: row-551/colfam1:51/1309813948222/Put/vlen=2 V: 51
> K: row-551/colfam1:51/1309812287200/Put/vlen=2 V: 51
> K: row-552/colfam1:52/1309813948256/Put/vlen=2 V: 52

Is the above statement true for an HFile?

Also, in the above example, the values for a given column qualifier are
not stored adjacent to each other, so they cannot take advantage of
compression (http://en.wikipedia.org/wiki/Column-oriented_DBMS#Compression).
Is this a correct statement?
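
To make the question concrete, here is a minimal sketch in Python (not HBase
code) of the layout I am describing. It assumes that cells in an HFile are
sorted by row key, then column family, then qualifier, then timestamp with
the newest first. The Cell type, the helper names and the sample data are
hypothetical and only for illustration.

```python
from typing import List, NamedTuple


class Cell(NamedTuple):
    """Hypothetical stand-in for one key/value entry in an HFile dump."""
    row: str
    family: str
    qualifier: str
    timestamp: int
    value: str


def hfile_order(cells: List[Cell]) -> List[Cell]:
    # Assumed sort order: row, family, qualifier ascending; timestamp descending.
    return sorted(cells, key=lambda c: (c.row, c.family, c.qualifier, -c.timestamp))


def positions_of_qualifier(ordered: List[Cell], family: str, qualifier: str) -> List[int]:
    # Indexes in the sorted file at which cells of one qualifier appear.
    return [i for i, c in enumerate(ordered)
            if c.family == family and c.qualifier == qualifier]


# Made-up dense data: three rows, each with qualifiers "a", "b" and "c".
cells = [
    Cell(f"row-{r}", "colfam1", q, 1000 + r, f"{q}{r}")
    for r in (550, 551, 552)
    for q in ("c", "a", "b")
]

ordered = hfile_order(cells)
positions = positions_of_qualifier(ordered, "colfam1", "a")

for i, cell in enumerate(ordered):
    print(i, cell.row, f"{cell.family}:{cell.qualifier}", cell.value)
print("positions of colfam1:a ->", positions)
```

With this made-up data, the cells of qualifier "a" sit at every third
position, separated by the other qualifiers of the same row. With 100 dense
qualifiers the gap would be 100 cells, which is the spread I am asking about.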

Regards,
Praveen

On Sat, Jan 21, 2012 at 9:03 PM, <[email protected]> wrote:

> Have you considered using AggregationProtocol to perform the aggregation?
>
> Thanks
>
>
>
> On Jan 20, 2012, at 11:08 PM, Praveen Sripati <[email protected]>
> wrote:
>
> > Hi,
> >
> > 1) According to this URL (1), HBase performs well with two or three
> > column families. Why is that?
> >
> > 2) A dump of an HFile looks like the one below. The contents of a row stay
> > together like a regular row-oriented database. If the column family has
> > 100 column qualifiers and is dense, then the data for a particular column
> > qualifier is spread widely. If I want to do an aggregation on a particular
> > column qualifier, the disk seeks don't seem to be much better than in a
> > regular row-oriented database.
> >
> > Please correct me if I am wrong.
> >
> > K: row-550/colfam1:50/1309813948188/Put/vlen=2 V: 50
> > K: row-550/colfam1:50/1309812287166/Put/vlen=2 V: 50
> > K: row-551/colfam1:51/1309813948222/Put/vlen=2 V: 51
> > K: row-551/colfam1:51/1309812287200/Put/vlen=2 V: 51
> > K: row-552/colfam1:52/1309813948256/Put/vlen=2 V: 52
> >
> > (1) - http://hbase.apache.org/book/number.of.cfs.html
> >
> > Thanks,
> > Praveen
>
