Hi everyone,
I have a cluster of 40 nodes. The input file has 2^18 lines, and every line is
the input to one map task. Every node is a quad core, so I've set
mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum
to a value greater than 4. On the first 20 nodes, the Hadoop jobs are using 100%
CPU, but with only one process running. Since each node is a quad core, I would
have expected to see 4 Java processes at 100% each (there are 5 Java processes
on this system, but 4 are idle and only one is using 100%, i.e. one CPU). On the
other half of the nodes, the CPU usage of the Hadoop processes is 0. This is
really strange, since my map tasks are running very slowly and I would have
liked to use all nodes and all cores.
What could possibly be wrong? Any suggestions would really help.
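
To put rough numbers on the gap, here is a small Python sketch that only does
arithmetic on the figures above. The target of one busy map process per core is
an assumption for illustration, not something Hadoop reports, and the variable
names are made up.

```python
# Figures taken from the description above.
NODES = 40
CORES_PER_NODE = 4        # quad-core machines
INPUT_LINES = 2 ** 18     # one map input per line

# Assumption: the goal is one busy map process per core on every node.
expected_busy = NODES * CORES_PER_NODE

# Observed: only the first 20 nodes are busy, with one busy process each.
observed_busy = 20 * 1

print("input lines:", INPUT_LINES)
print("expected busy map processes:", expected_busy)
print("observed busy map processes:", observed_busy)
print("fraction of cores in use:", observed_busy / expected_busy)
```

So, under that assumption, only about one eighth of the cores are doing any work.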
Thanks,
H