Toby DiPasquale wrote:
In short, yes. Hadoop's code takes advantage of multiple native threads. You can tune the level of concurrency in the system by setting mapred.map.tasks and mapred.reduce.tasks, which lets you take advantage of multiple cores on the nodes that have them.
More importantly, you should set mapred.tasktracker.tasks.maximum according to the number of cores per node. That parameter determines how many tasks run simultaneously on each node. Note that, at this point, this parameter is global to the cluster and cannot be configured independently per node. Someone with a heterogeneous cluster might be interested in fixing that someday...
Doug
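
As a rough illustration of the advice above, here is a minimal sketch that derives a value for mapred.tasktracker.tasks.maximum from a node's core count and prints it as a Hadoop-style property element. The one-task-per-core rule and the helper names (tasks_for_cores, property_xml) are assumptions made for this example, not something stated in the thread. Because the setting is cluster-wide, the number would need to suit a typical node rather than each node individually, and the output would be pasted by hand into the cluster's site configuration file.

```python
import os
from xml.sax.saxutils import escape


def tasks_for_cores(cores):
    """Assumed rule of thumb: one simultaneous task per core, at least one.

    `cores` may be None because os.cpu_count() can fail to detect it.
    """
    return max(1, cores or 1)


def property_xml(name, value):
    """Render a single <property> element in Hadoop's configuration style."""
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{escape(str(value))}</value>\n"
        "</property>"
    )


if __name__ == "__main__":
    # Core count of the machine this runs on; run it on a representative node.
    cores = os.cpu_count()
    print(property_xml("mapred.tasktracker.tasks.maximum",
                       tasks_for_cores(cores)))
```

Run on a representative node, this prints one property element whose value equals that node's detected core count. On a mixed cluster you would have to pick a single compromise value, since the parameter cannot be set per node.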