You need to check your cluster's Map/Reduce task capacity, i.e. how many map and reduce tasks can run on the cluster at once. You can see it on the JobTracker web UI at http://JobtrackerServerIP:50030. You should also check the total number of map tasks in your job; it should be greater than the cluster's map task capacity, otherwise some slots will sit idle.
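As a rough sketch of what "slots per node" means here (property names are from the Hadoop 0.20-era mapred-site.xml; the values below are illustrative, not a recommendation), the per-TaskTracker maximums are set like this:

```xml
<!-- mapred-site.xml on each TaskTracker node.
     Property names per the Hadoop 0.20 line; verify against your version. -->
<configuration>
  <!-- Max concurrent map tasks per node; 4 would be one per core on a quad core -->
  <property>
    <name>mapred.tasktracker.map.tasks.maximum</name>
    <value>4</value>
  </property>
  <!-- Max concurrent reduce tasks per node -->
  <property>
    <name>mapred.tasktracker.reduce.tasks.maximum</name>
    <value>2</value>
  </property>
</configuration>
```

With 40 nodes at 4 map slots each, the cluster's map task capacity would be 160; if your job has fewer map tasks than that, some nodes will show no hadoop CPU usage at all.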
Initially, reduce tasks will be idle until the first batch of map tasks completes.

--
Thanks & Regards,
Chandra Prakash Bhagtani

On Sat, Sep 12, 2009 at 10:31 AM, himanshu chandola <[email protected]> wrote:
> Hi everyone,
> I've a cluster of 40 nodes. The input file has 2^18 lines and every line is
> an input to a map job. Every node is a quad core and hence I've set
> mapred.tasktracker.map/reduce.tasks.maximum to a value greater than 4. The
> first 20 nodes are showing hadoop jobs taking 100%, but with only one process
> running, while since it's a quad core I would've liked to see 4 java processes
> taking 100% (there are 5 java processes on this system but 4 are idle and
> only one is taking 100%, or 1 cpu). For the last half of the nodes, the cpu
> usage of hadoop processes is 0. This is really strange since my map tasks
> are processing in a very slow way and I would've liked to use all nodes and
> all the cores.
>
> What could possibly be wrong? It would really help if anyone could suggest.
>
> thanks
>
> H
>
> Morpheus: Do you believe in fate, Neo?
> Neo: No.
> Morpheus: Why Not?
> Neo: Because I don't like the idea that I'm not in control of my life.
