I don't think it will help much: in a Hadoop cluster, people already
allocate "slots" to match the number of cores, so the inherent
parallelism should already be exploited, since different mappers/reducers
are completely independent.
On Wed, Dec 12, 2012 at 2:09 AM, Yu Yang wrote:
> Dears,
>
Exactly - a job is already designed to be properly parallel with respect
to its input, and this would just add the extra overhead of job setup and
scheduling. If your per-record processing requires threaded work,
consider using the MultithreadedMapper/Reducer classes instead.
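
For reference, here is a minimal sketch of wiring up MultithreadedMapper in
the new mapreduce API; the SlowRecordMapper class, paths, and thread count
are illustrative assumptions, not from this thread:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class MultithreadedExample {

    // Hypothetical per-record mapper whose map() does expensive or blocking
    // work; MultithreadedMapper runs several instances of it concurrently
    // inside a single map task JVM.
    public static class SlowRecordMapper
            extends Mapper<LongWritable, Text, Text, LongWritable> {
        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // ... expensive per-record processing would go here ...
            context.write(value, new LongWritable(1));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "multithreaded-mapper-example");
        job.setJarByClass(MultithreadedExample.class);

        // The task's mapper is MultithreadedMapper; the real work is
        // delegated to SlowRecordMapper, run by a thread pool in the task.
        job.setMapperClass(MultithreadedMapper.class);
        MultithreadedMapper.setMapperClass(job, SlowRecordMapper.class);
        MultithreadedMapper.setNumberOfThreads(job, 4); // illustrative count

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(LongWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

Note this only adds threads within each map task; it does not create extra
tasks, so it helps mainly when per-record work is I/O-bound or otherwise
underuses the core a slot already has.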