Hello,
I created a map/reduce process by extending the TableMap and
TableReduce API, but for some reason, when I run multiple mappers,
the logs show that the same rows are being processed by each mapper.
When I say logs, I mean the Hadoop task tracker (localhost:50030),
drilling down into the task logs.
Do I need to manually perform a TableSplit, or is this supposed to
happen automatically?
If it's something I need to do manually, can someone point me to some
sample code?
If it's supposed to be automatic and each mapper was supposed to get
its own set of rows, should I file a bug for this? I'm using trunk
0.2.0 on Hadoop trunk 0.17.2.
thanks,
Dru