One HStore per column family, really? So getting the whole family is not expensive?
Thanks

From: "Sébastien_Rainville" <[EMAIL PROTECTED]>
Subject: Re: performance on getRow and get
Date: Tue, 15 Jul 2008 09:08:03 -0400

> Hi Daniel,
>
> Yes, get(row) is more expensive than get(row, column name). Keep in mind that
> HBase is column oriented. So when you fetch data from multiple columns it
> means that it will need to access multiple files (1 per column family) in
> order to get the data for the whole row.
>
> Sebastien
>
> On Tue, Jul 15, 2008 at 8:54 AM, Daniel <[EMAIL PROTECTED]> wrote:
>
> > hi all,
> > I'm writing a program to access my hbase table in a MR job. my first
> > version is to get different values from get(row, column name),
> > and now I'm changing to get one row each time into a map, and query that map
> > instead - for one reduce job.
> > i think it would be better to access hbase only once per reduce
> > function, but it seems like the latter version takes a longer time to
> > finish during the reduce job. does this mean get(row, column name) is less
> > expensive than get(row)?
> > thanks.
> >
> > Daniel
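
To make the point in Sebastien's reply concrete, here is a rough sketch in Python. It is a toy model only and does not use the HBase client API: `ToyTable`, `get`, `get_row` and the family names are all made up for illustration, and a plain dict stands in for each per-family store. It only counts how many stores a read has to touch, assuming one store per column family as described in the reply.

```python
# Toy model only: NOT the HBase client API. All names here are hypothetical.
# Assumption taken from the thread: each column family has its own store.


class ToyTable:
    def __init__(self, families):
        # One dict per column family stands in for one store.
        self.stores = {family: {} for family in families}
        self.store_reads = 0  # how many stores have been consulted so far

    def put(self, row, column, value):
        # Columns are named "family:qualifier"; the family picks the store.
        family = column.split(":", 1)[0]
        self.stores[family][(row, column)] = value

    def get(self, row, column):
        # Single column: only the store of that column's family is read.
        family = column.split(":", 1)[0]
        self.store_reads += 1
        return self.stores[family].get((row, column))

    def get_row(self, row):
        # Whole row: every family's store has to be consulted.
        result = {}
        for store in self.stores.values():
            self.store_reads += 1
            for (stored_row, column), value in store.items():
                if stored_row == row:
                    result[column] = value
        return result


table = ToyTable(["info", "stats", "raw"])
table.put("row1", "info:name", "daniel")
table.put("row1", "stats:count", 42)
table.put("row1", "raw:blob", "xyz")

name = table.get("row1", "info:name")
single_reads = table.store_reads  # 1 store touched

table.store_reads = 0
whole_row = table.get_row("row1")
row_reads = table.store_reads  # 3 stores touched, one per family
```

In this model a single-column read touches one store while a whole-row read touches all three, which is the cost difference the reply describes. Real read cost also depends on things the sketch ignores, such as caching and the number of files behind each store.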
