Re: newbie - map reduce not distributing

Dru Jensen Wed, 30 Jul 2008 14:03:27 -0700

J-D,

Thanks for your quick response. I have 4 mapping processes running on 3 systems.

Are the same rows being processed 4 times, once by each mapping process? According to the logs, they are.

When I run a map/reduce against a file, each row gets logged by only one mapper. Why would this be different for hbase tables?

I would think only one mapping process would process that one row, and it would show up only once, in only one log.

Preferably, it would be the same system that has the region.

I want each row to be processed only once. Is there any way to change this behavior without running only 1 mapper?


thanks,
Dru

On Jul 30, 2008, at 1:44 PM, Jean-Daniel Cryans wrote:

Dru,
Regions split once they reach a certain size threshold, so if you want
your computation to be distributed, you will need more data.
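
To illustrate the point, here is a minimal sketch of region-based splitting, assuming (as the reply implies) that a table's rows are handed to mappers one region at a time. It is a toy model and not the HBase API: the class RegionSplitSketch, the Split type, and the row keys are all hypothetical names made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: this does NOT use or reproduce the HBase API.
public class RegionSplitSketch {

    // A half-open row range [startRow, endRow).
    // An empty endRow means "up to the end of the table".
    static final class Split {
        final String startRow;
        final String endRow;

        Split(String startRow, String endRow) {
            this.startRow = startRow;
            this.endRow = endRow;
        }

        boolean contains(String row) {
            return row.compareTo(startRow) >= 0
                    && (endRow.isEmpty() || row.compareTo(endRow) < 0);
        }
    }

    // Assumption: one split per region, where region i covers
    // [regionStartKeys[i], regionStartKeys[i + 1]).
    static List<Split> splitsFor(String[] regionStartKeys) {
        List<Split> splits = new ArrayList<>();
        for (int i = 0; i < regionStartKeys.length; i++) {
            String end = (i + 1 < regionStartKeys.length) ? regionStartKeys[i + 1] : "";
            splits.add(new Split(regionStartKeys[i], end));
        }
        return splits;
    }

    // Counts how many splits would see the given row.
    static int ownersOf(List<Split> splits, String row) {
        int owners = 0;
        for (Split s : splits) {
            if (s.contains(row)) {
                owners++;
            }
        }
        return owners;
    }

    public static void main(String[] args) {
        // A small table that has never split: one region, so one unit of work.
        System.out.println(splitsFor(new String[] {""}).size()); // prints 1

        // The same table after it has grown into three regions.
        List<Split> three = splitsFor(new String[] {"", "row500", "row900"});
        System.out.println(three.size());              // prints 3
        System.out.println(ownersOf(three, "row700")); // prints 1
    }
}
```

Under this model a table with a single region yields a single unit of work no matter how many mappers are requested, and once there are several regions each row falls into exactly one range.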

Regards,

J-D
On Wed, Jul 30, 2008 at 4:36 PM, Dru Jensen <[EMAIL PROTECTED]> wrote:
Hello,
I created a map/reduce process by extending the TableMap and TableReduce
API, but for some reason,
when I run multiple mappers, the logs show that the same rows are
being processed by each Mapper.
When I say logs, I mean in the hadoop task tracker (localhost:50030) and
drilling down into the logs.
Do I need to manually perform a TableSplit, or is this supposed to be done
automatically?
If it's something I need to do manually, can someone point me to some sample
code?
If it's supposed to be automatic and each mapper was supposed to get its own
set of rows,
should I write up a bug for this? I am using trunk 0.2.0 on hadoop trunk
0.17.2.

thanks,
Dru
