At that level of parallelism, you're right that the process overhead would be too high.

- Aaron

On Fri, Apr 10, 2009 at 11:36 AM, Sagar Naik <[email protected]> wrote:

> Two things:
> - Multi-threading is preferred over multiple processes. The process I'm
>   planning is IO bound, so I can really take advantage of multiple threads
>   (100 threads).
> - Correct me if I'm wrong: the next MR job in the pipeline will have an
>   increased number of splits to process, because the number of reducer
>   outputs (from the previous job) has increased. This leads to an increase
>   in the map-task completion time.
>
> -Sagar
>
> Aaron Kimball wrote:
>
>> Rather than implementing a multi-threaded reducer, why not simply increase
>> the number of reducer tasks per machine via
>> mapred.tasktracker.reduce.tasks.maximum, and increase the total number of
>> reduce tasks per job via mapred.reduce.tasks to ensure that they're all
>> filled? This will effectively utilize a higher number of cores.
>>
>> - Aaron
>>
>> On Fri, Apr 10, 2009 at 11:12 AM, Sagar Naik <[email protected]> wrote:
>>
>>> Hi,
>>> I would like to implement a multi-threaded reducer.
>>> As I understand it, the system does not provide one because the output
>>> is expected to be sorted.
>>>
>>> However, in my case I don't need the output sorted.
>>>
>>> Can you please point out any other issues, or would it be safe to do so?
>>>
>>> -Sagar
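
For reference, the multi-threaded approach discussed in this thread could look roughly like the sketch below. It is plain Java with no Hadoop dependency: the class name ThreadedReduceSketch, the helper processValue, and the simulated 5 ms of IO are all hypothetical and exist only for illustration. The idea is that a single reduce call hands each value to a fixed-size thread pool and waits for every task to finish before returning. In a real Hadoop reducer you would also need to copy each value before passing it to a worker (the framework reuses the objects returned by the values iterator), and it is probably safest to assume the output collector is not thread-safe and to serialize calls to it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ThreadedReduceSketch {

    // Hypothetical stand-in for the IO-bound work done for each value.
    static String processValue(String key, String value) throws InterruptedException {
        Thread.sleep(5); // simulates a blocking IO call
        return key + ":" + value.toUpperCase();
    }

    // Mimics the body of one reduce(key, values) call: fan the values out
    // to the pool, then wait for all of them before returning.
    static List<String> reduce(final String key, Iterable<String> values,
                               ExecutorService pool) throws Exception {
        List<Future<String>> pending = new ArrayList<Future<String>>();
        for (final String value : values) {
            // Java Strings are immutable, so no copy is needed here; reused
            // Hadoop Writable values would have to be copied at this point.
            pending.add(pool.submit(() -> processValue(key, value)));
        }
        List<String> results = new ArrayList<String>();
        for (Future<String> f : pending) {
            // get() blocks until the task is done and rethrows any failure
            // from the worker thread wrapped in an ExecutionException.
            results.add(f.get());
        }
        return results; // collected in submission order, by one thread
    }

    public static void main(String[] args) throws Exception {
        // 100 threads, matching the figure mentioned in the thread.
        ExecutorService pool = Executors.newFixedThreadPool(100);
        try {
            List<String> out = reduce("k1", Arrays.asList("a", "b", "c"), pool);
            for (String line : out) {
                System.out.println(line);
            }
        } finally {
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
    }
}
```

Collecting the results on the calling thread keeps the writes single-threaded, at the cost of buffering one key's results in memory. Since the output does not need to be sorted, writing each result as soon as its task completes (under a lock) would be a reasonable alternative.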
