Using Java 5 will allow the threads of the various tasks to take
advantage of multiple processors. Just make sure you set your map tasks
property to a multiple of the total number of processors. We are running
multi-core machines and are seeing good utilization across all cores
this way.
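As a rough illustration of the arithmetic, here is a minimal sketch that computes a map task count as a multiple of the total processor count. The class name, the node count, and the multiplier are hypothetical and not taken from any Hadoop code; it also assumes every node has the same number of cores as the machine running the sketch. The result would go into the map tasks property in your job configuration (likely mapred.map.tasks, but check the name against your Hadoop version).

```java
public class MapTaskCount {

    // Returns a map task count that is a multiple of the total number of
    // processors in the cluster (nodes * coresPerNode * multiplier).
    static int mapTasks(int nodes, int coresPerNode, int multiplier) {
        if (nodes <= 0 || coresPerNode <= 0 || multiplier <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        // multiplyExact throws ArithmeticException on int overflow.
        return Math.multiplyExact(Math.multiplyExact(nodes, coresPerNode), multiplier);
    }

    public static void main(String[] args) {
        // Processors visible to this JVM; assumed identical on every node.
        int coresPerNode = Runtime.getRuntime().availableProcessors();
        int nodes = 10;      // hypothetical cluster size
        int multiplier = 2;  // hypothetical: two map tasks per processor

        System.out.println("suggested map tasks: "
                + mapTasks(nodes, coresPerNode, multiplier));
    }
}
```

The multiplier is a tuning knob: a value above 1 gives the scheduler some slack when tasks finish at different times, but the right value depends on your jobs.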
Dennis
Gianlorenzo Thione wrote:
Hello everybody,
I'll ask my first question on this forum and hopefully start building
more and more understanding of Hadoop so that we can eventually
contribute actively. In the meantime, I have a simple
issue/question/suggestion....
I have many multi-core, multi-processor nodes in my cluster and I'd
like to be able to run several tasktrackers and datanodes per physical
machine. I am modifying the startup scripts so that a number of worker
JVMs can be started on each node, maxed out at the number of CPUs seen
by the kernel.
Since our map jobs are highly CPU-intensive, it makes sense to run
parallel jobs on each node, maximizing the CPU utilization.
Is that something that would make sense to roll back into the scripts
for Hadoop as well? Is anybody else running on multiprocessor
architectures?
Lorenzo Thione
Powerset, Inc.