Using Java 5 will allow the threads of the various tasks to take advantage of multiple processors. Just make sure you set your map tasks property to a multiple of the total number of processors. We are running multi-core machines and are seeing good utilization across all cores this way.
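That suggestion might look something like the following in `hadoop-site.xml`. This is only a sketch for an 8-core node: the property names (`mapred.tasktracker.tasks.maximum`, `mapred.map.tasks`) are assumed from the early Hadoop configuration and the values are illustrative, so verify both against your release's defaults.

```xml
<!-- Sketch for hadoop-site.xml on an 8-core node: allow roughly one
     concurrent task per core. Property names assumed from the early
     Hadoop configuration; check your release's documentation. -->
<property>
  <name>mapred.tasktracker.tasks.maximum</name>
  <value>8</value>
</property>
<property>
  <name>mapred.map.tasks</name>
  <!-- a multiple of the cluster's total core count -->
  <value>16</value>
</property>
```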

Dennis



Gianlorenzo Thione wrote:
Hello everybody,

I'll ask my first question on this forum and hopefully start building more and more understanding of Hadoop so that we can eventually contribute actively. In the meantime, I have a simple issue/question/suggestion....

I have many multi-core, multi-processor nodes in my cluster and I'd like to be able to run several tasktrackers and datanodes per physical machine. I am modifying the startup scripts so that a number of worker JVMs can be started on each node, capped at the number of CPUs seen by the kernel.

Since our map jobs are highly CPU-intensive, it makes sense to run parallel tasks on each node, maximizing CPU utilization.
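The sizing step described above (one worker per CPU seen by the kernel) can be sketched in shell. This is only an illustration: `nproc` is a modern coreutils convenience rather than what the original scripts would have used (those would more likely have parsed /proc/cpuinfo), and the multiplier of 2 is an assumed example, not a recommendation from this thread.

```shell
#!/bin/sh
# Illustrative sketch: derive a per-node task count from the kernel's
# visible CPU count. nproc and the x2 multiplier are assumptions.
CORES=$(nproc)
MAP_TASKS=$((CORES * 2))
# The computed value would then be fed into the Hadoop configuration.
echo "mapred.map.tasks=$MAP_TASKS"
```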

Is that something that would make sense to roll back into the Hadoop startup scripts as well? Is anybody else running on multi-processor architectures?

Lorenzo Thione
Powerset, Inc.
