Re: muti-thread mapreduce

2012-12-12 Thread Yang
I don't think it would help much. In a Hadoop cluster, people typically already set the number of task slots to the number of cores, so the inherent parallelism should already be exploited, since different mappers/reducers are completely independent.
On Wed, Dec 12, 2012 at 2:09 AM, Yu Yang wrote: > Dears, >

Re: muti-thread mapreduce

2012-12-12 Thread Harsh J
Exactly. A job is already designed to be properly parallel w.r.t. its input, and this would just add the overhead of job setup and scheduling. If your per-record processing requires threaded work, consider using the MultithreadedMapper/Reducer classes instead.
On Wed, Dec 12, 2012 at 10:5
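
To illustrate what "threaded work per record" inside a single task could look like, here is a minimal plain-Java sketch. It does not use Hadoop's MultithreadedMapper and is not how that class is implemented; the class name ThreadedRecordProcessor and the method processAll are hypothetical names made up for this example. It only shows the general idea: one task hands its records to a small thread pool instead of launching more tasks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical illustration only; not part of Hadoop.
public class ThreadedRecordProcessor {

    // Applies perRecord to every record on a fixed-size thread pool and
    // returns the results in the same order as the input records.
    public static <I, O> List<O> processAll(List<I> records,
                                            Function<I, O> perRecord,
                                            int numThreads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            // Submit one task per record; the pool bounds the concurrency.
            List<Future<O>> futures = new ArrayList<>();
            for (I record : records) {
                Callable<O> task = () -> perRecord.apply(record);
                futures.add(pool.submit(task));
            }
            // Collect results in submission order.
            List<O> results = new ArrayList<>();
            for (Future<O> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> records = Arrays.asList("a b", "c d e", "f");
        // Count whitespace-separated tokens per record using 4 worker threads.
        List<Integer> counts = processAll(records, line -> line.split(" ").length, 4);
        System.out.println(counts); // prints [2, 3, 1]
    }
}
```

This kind of in-task threading tends to pay off only when each record is expensive or blocks on something external; for CPU-bound work with slots already matched to cores, the extra threads just compete for the same cores.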