Hello everybody,
This is my first question on this forum, and I hope to keep building
my understanding of Hadoop so that we can eventually contribute
actively. In the meantime, I have a simple issue/question/suggestion.
I have many multi-core, multi-processor nodes in my cluster and I'd
like to be able to run several tasktrackers and datanodes per physical
machine. I am modifying the startup scripts so that a number of
worker JVMs can be started on each node, capped at the number of
CPUs seen by the kernel.
Since our map jobs are highly CPU-intensive, it makes sense to run
several of them in parallel on each node to maximize CPU utilization.
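As a rough illustration of the cap described above, here is a minimal
sketch in Python (the real startup scripts are shell, so this is only
an analogy). The name plan_workers and the "requested" parameter are
hypothetical, and the launch step is a placeholder: each instance
would presumably also need its own ports and local directories, which
this sketch does not attempt to configure.

```python
import os


def plan_workers(requested, cpu_count=None):
    """Return one index per worker JVM to start, capped at the CPU count.

    Hypothetical helper for illustration; not part of Hadoop.
    """
    if cpu_count is None:
        # os.cpu_count() can return None when the count is unknown,
        # so fall back to a single worker in that case.
        cpu_count = os.cpu_count() or 1
    # Never start more workers than CPUs, and never a negative number.
    n = max(0, min(requested, cpu_count))
    return list(range(n))


if __name__ == "__main__":
    for i in plan_workers(requested=8):
        # Placeholder: a real script would launch a tasktracker here
        # with per-instance settings (assumed, not shown).
        print(f"would start worker {i}")
```

For example, plan_workers(8, cpu_count=4) yields four worker indices,
while plan_workers(2, cpu_count=4) yields only two.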
Is that something that would make sense to roll back into the Hadoop
scripts as well? Is anybody else running on multi-processor
architectures?
Lorenzo Thione
Powerset, Inc.