One HStore per column family, really? So getting a whole family is not expensive?

Thanks

From: "Sébastien Rainville" <[EMAIL PROTECTED]>
Subject: Re: performance on getRow and get
Date: Tue, 15 Jul 2008 09:08:03 -0400

> Hi Daniel,
> 
> Yes, get(row) is more expensive than get(row, column name). Keep in mind that
> HBase is column-oriented, so fetching data from multiple columns means it
> has to access multiple files (one per column family) in order to assemble
> the data for the whole row.
> 
> Sebastien
> 
> 
> 
> 
> On Tue, Jul 15, 2008 at 8:54 AM, Daniel <[EMAIL PROTECTED]> wrote:
> 
> > Hi all,
> >   I'm writing a program that accesses my HBase table in an MR job. My first
> > version fetched each value separately with get(row, column name). I am now
> > changing it to fetch one whole row at a time into a map and query that map
> > instead, in each reduce function.
> >   I thought it would be better to access HBase only once per reduce
> > function, but the latter version seems to take longer to finish during
> > the reduce job. Does this mean get(row, column name) is less expensive
> > than get(row)?
> >  Thanks.
> >
> > Daniel
> >
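
To make the point about "one file per column family" concrete, here is a small toy model in Python. It is only a sketch and not the HBase client API: the ToyTable class, its method names, and the store_reads counter are all made up for illustration, and each "store" is just an in-memory dict. As far as I understand it, a real store also has an in-memory cache and possibly several files on disk, so actual costs will differ. The model only counts how many per-family stores each kind of read has to touch.

```python
from collections import defaultdict


class ToyTable:
    """Toy model with one in-memory "store" per column family.

    Hypothetical illustration only; this is not the HBase API.
    """

    def __init__(self, families):
        # family name -> {row key -> {qualifier -> value}}
        self.stores = {name: defaultdict(dict) for name in families}
        self.store_reads = 0  # number of stores touched by reads so far

    def put(self, row, column, value):
        # Columns are addressed as "family:qualifier".
        family, qualifier = column.split(":", 1)
        self.stores[family][row][qualifier] = value

    def _read_store(self, family, row):
        # Every read of a family's store is counted once.
        self.store_reads += 1
        return self.stores[family].get(row, {})

    def get(self, row, column):
        # One cell: only the store of that column's family is touched.
        family, qualifier = column.split(":", 1)
        return self._read_store(family, row).get(qualifier)

    def get_family(self, row, family):
        # All columns of one family: still a single store.
        cells = self._read_store(family, row)
        return {f"{family}:{q}": v for q, v in cells.items()}

    def get_row(self, row):
        # Whole row: every family's store has to be consulted.
        result = {}
        for family in self.stores:
            result.update(self.get_family(row, family))
        return result


table = ToyTable(["info", "stats", "content"])
table.put("row1", "info:name", "daniel")
table.put("row1", "info:email", "d@example.org")
table.put("row1", "stats:hits", 42)
table.put("row1", "content:body", "hello")

table.get("row1", "info:name")
single_cell_reads = table.store_reads  # 1 store touched

table.store_reads = 0
table.get_family("row1", "info")
family_reads = table.store_reads  # 1 store touched

table.store_reads = 0
whole_row = table.get_row("row1")
row_reads = table.store_reads  # 3 stores touched, one per family
```

In this model, reading a single column or a whole family touches one store, while reading the whole row touches one store per family. That matches the explanation above, and it suggests that fetching a whole family should cost about the same number of store accesses as fetching one of its columns, under the assumptions of this sketch.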
