Definitely. HBase is all about data locality. The general grouping of everything is (row+family), stored in row order (so crossing to the next row for the same family is "cheap", but that's only available in a scanner). So, in general, you want to keep things that will be read together in the same row+family; if you need to cross rows, keep them in the same family.
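To make the grouping concrete, here is a toy Python sketch. It is not HBase code: the `ToyTable` class, its methods, and the sample rows are all made up for illustration. It assumes each family is kept as its own store, sorted by row key, so that a (row+family) group is one lookup and a scan is a walk over neighbouring rows of one family.

```python
import bisect
from collections import defaultdict


class ToyTable:
    """Toy model: one row-sorted store per column family (illustration only)."""

    def __init__(self):
        self._keys = defaultdict(list)   # family -> sorted list of row keys
        self._cells = defaultdict(dict)  # family -> {row: {qualifier: value}}

    def put(self, row, family, qualifier, value):
        cells = self._cells[family]
        if row not in cells:
            # Keep each family's row keys in sorted order.
            bisect.insort(self._keys[family], row)
            cells[row] = {}
        cells[row][qualifier] = value

    def get(self, row, family):
        """Point read of a single (row, family) group."""
        return dict(self._cells[family].get(row, {}))

    def scan(self, family, start_row, stop_row):
        """Walk one family's store in row order over [start_row, stop_row)."""
        keys = self._keys[family]
        lo = bisect.bisect_left(keys, start_row)
        hi = bisect.bisect_left(keys, stop_row)
        for row in keys[lo:hi]:
            yield row, dict(self._cells[family][row])


table = ToyTable()
table.put("user2", "profile", "name", "Bob")
table.put("user1", "profile", "name", "Ann")
table.put("user1", "profile", "email", "ann@example.com")
table.put("user1", "activity", "last_login", "2009-04-01")

# Things read together live in one (row, family) group: a single lookup.
profile = table.get("user1", "profile")

# Crossing rows within one family is a sequential walk in row order.
rows = list(table.scan("profile", "user1", "user3"))
```

In this model, reading both `profile` and `activity` for the same row takes two separate lookups, one per store, which is why data that is read together is better placed in one family.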
Grabbing multiple families is not particularly efficient today; it's really like running separate, sequential (not parallel) reads. There's lots of room for improvement here, some of which will be seen in 0.20. Stay tuned to HBASE-1249 and related issues for details.

JG

> -----Original Message-----
> From: Wes Chow [mailto:[email protected]]
> Sent: Wednesday, April 01, 2009 8:11 AM
> To: [email protected]
> Subject: Re: mapreduce locality
>
> Jonathan Gray wrote:
> > Currently, we cannot be perfect with MR jobs running locally.
> >
> > We can, and (I believe in 0.19) we do, make an effort to put
> > TableInputFormat map tasks on the same nodes as the region is hosted. From
> > there, the actual locations of the storefiles that make up the region could
> > be on any datanode. So it's impossible to ensure all data is local from the
> > Task -> RegionServer -> DataNode.
> >
> > There would be tremendous value in that case, and other cases like
> > HADOOP-4801, that being able to encourage a region's blocks to be co-hosted
> > on the node with the region would unlock. Still hoping something comes of
> > that, unfortunately it's not even on my radar to look into myself.
>
> I guess in a sense you could use column families to group data that
> would benefit from locality?
>
> Wes
>
> >> -----Original Message-----
> >> From: Wes Chow [mailto:[email protected]]
> >> Sent: Wednesday, April 01, 2009 6:19 AM
> >> To: [email protected]
> >> Subject: mapreduce locality
> >>
> >> When running MapReduce processes with HBase, is it possible to have
> >> Hadoop move the job to the machine that contains the relevant HStore? I
> >> thought I read that it does do this at some point, but I'm unable to
> >> find that reference at this moment...
> >>
> >> Wes
