On Mon, 05 May 2008 11:29:00 -0700, Doug Cutting <[EMAIL PROTECTED]> wrote:
> Brice Arnould wrote:
>> I was asking myself if it could be a good idea to parallelize some of the
>> algorithms of Hadoop, such as MergeSorter, for the case where a single job
>> is run on a multicore system.
>
> One can already exploit parallelism on a multicore system by using
> "pseudo-distributed" mode and increasing
> mapred.tasktracker.map.tasks.maximum and
> mapred.tasktracker.reduce.tasks.maximum.
>
> LocalRunner should also someday be enhanced to run multiple maps and
> reduces in separate threads, which would be more efficient, since
> intermediate data would not need to travel through the loopback network
> interface. But I don't see an urgent case for making the sort code
> itself multi-threaded, since MapReduce itself performs parallel sorting.

Sorry, I had really misunderstood the way it works.
Thanks for your explanations. I'm going to look at LocalJobRunner.

Brice
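
P.S. For anyone finding this thread later, here is a rough sketch of what raising the two limits mentioned above might look like in a pseudo-distributed setup. The file name (hadoop-site.xml) and the values 4 and 2 are my assumptions, not something from Doug's message; pick values that suit the number of cores on the machine.

```xml
<!-- Sketch only: assumed to live in conf/hadoop-site.xml on the node
     running the TaskTracker. The values below are illustrative. -->
<configuration>
  <property>
    <!-- Maximum number of map tasks this TaskTracker runs at once -->
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <property>
    <!-- Maximum number of reduce tasks this TaskTracker runs at once -->
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
</configuration>
```

As far as I understand, these are read by the TaskTracker itself rather than per job, so the daemon would probably need a restart for new values to take effect.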
