You understand the HBase data model, yes? Each region gets a mapper, and each mapper reads the rows for its region and feeds them into the map function. On the output side, each reducer just writes to HBase. The parallelism can support millions of row reads per second.
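
To make the region-per-mapper idea concrete, here is a rough toy model in plain Python. It is not the HBase or Hadoop API: the table contents, the region boundaries, and every function name below are made up for illustration. It only shows how a sorted key space split into regions lets one task per region scan its own rows in parallel.

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor

# Toy "table": row key -> value. Hypothetical data, not an HBase client.
TABLE = {f"row{i:03d}": i for i in range(10)}

# Sorted region start keys (assumed for the example).
# Region i covers keys in [REGION_STARTS[i], REGION_STARTS[i + 1]).
REGION_STARTS = ["", "row003", "row007"]


def region_of(row_key):
    """Return the index of the region whose key range contains row_key."""
    return bisect_right(REGION_STARTS, row_key) - 1


def rows_in_region(region_index):
    """Return the (key, value) pairs of one region, in key order."""
    return sorted(
        (key, value)
        for key, value in TABLE.items()
        if region_of(key) == region_index
    )


def map_region(region_index, map_fn):
    """One 'mapper' per region: feed each row of that region to map_fn."""
    return [map_fn(key, value) for key, value in rows_in_region(region_index)]


def run_job(map_fn, reduce_fn):
    """Run one mapper per region in parallel, then reduce all mapped values."""
    region_ids = range(len(REGION_STARTS))
    with ThreadPoolExecutor(max_workers=len(REGION_STARTS)) as pool:
        per_region = list(pool.map(lambda r: map_region(r, map_fn), region_ids))
    mapped = [item for part in per_region for item in part]
    return reduce_fn(mapped)


# Double every value in the map step, sum everything in the reduce step.
total = run_job(lambda key, value: value * 2, sum)
```

In the real system the mappers run as separate tasks across the cluster rather than as threads in one process, and the reducers write their output back to HBase instead of returning a value.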

I don't understand the rest of your question unfortunately. good luck!

-ryan

On Tue, Oct 5, 2010 at 9:40 PM, William Kang <[email protected]> wrote:
> Can you tell me a little about how HBase works with MR? If the MR
> source/sink has to go through just ONE region client, then it is not I am
> looking for. But if MR can plug directly with the region server containing
> specific rows, then it might work. Furthermore, MR is a heavy weight process
> with lots of overhead. Ideally, we want something light weight and can get
> result fast. Many thanks.
>
>
> William
>
> On Wed, Oct 6, 2010 at 12:01 AM, Jeff Zhang <[email protected]> wrote:
>
>> You can incorporate map reduce with hbase for parallel computing.
>>
>>
>>
>> On Wed, Oct 6, 2010 at 11:24 AM, William Kang <[email protected]>
>> wrote:
>> > Hi guys,
>> > Is there any project going on co-processing on region servers? Right now,
>> > we have to transfer all data from region servers to region client after
>> > query, is that right? This can be slow. Furthermore, the cpus on the
>> > region servers are not fully used. If we could distribute the computation
>> > along with the data on region server, that would be really handy for some
>> > problems. Is it possible to do so? Many thanks.
>> >
>> >
>> > William
>> >
>>
>>
>>
>> --
>> Best Regards
>>
>> Jeff Zhang
>>
