Hi all,
I have been trying to figure out why all mappers run on only one machine when I 
have a 4-node cluster. The reduce part runs correctly on all 4 nodes. I am 
using Hadoop 0.20.2. My input is a single large file (10 GB).

Here is my config in mapred-site.xml. I set mapred.map.tasks to 30, but I only 
see one map task, and that on only one machine. Are there any other parameters I 
need to set in order to distribute the map tasks evenly across the nodes?
<configuration>
        <property>
          <name>mapred.job.tracker</name>
          <value>master-hadoop:54311</value>
          <description>The host and port that the MapReduce job tracker runs
          at.  If "local", then jobs are run in-process as a single map
          and reduce task.
          </description>
        </property>
        <property>
          <name>mapred.child.java.opts</name>
          <value>-Xmx4096m</value>
          <description>map heap size for child task</description>
        </property>
        <property>
          <name>mapred.reduce.parallel.copies</name>
          <value>5</value>
          <description></description>
        </property>
        <property>
          <name>mapred.map.tasks</name>
          <value>30</value>
          <description></description>
        </property>
        <property>
          <name>mapred.reduce.tasks</name>
          <value>6</value>
          <description></description>
        </property>
</configuration>
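
For reference, here is a rough sketch of how I understand the number of map tasks to be derived, which may explain why mapred.map.tasks acts only as a hint. As far as I know, the old-API FileInputFormat in 0.20 picks a split size of max(minSize, min(goalSize, blockSize)), where goalSize is the total input size divided by the requested number of maps. The class name SplitEstimate, the 64 MB block size, and the minimum split size of 1 byte are assumptions for illustration, not values read from my cluster. The sketch also ignores the small slop the real code allows on the last split, and it applies only to splittable input; a non-splittable file (for example, gzip-compressed) would go to a single map regardless.

```java
public class SplitEstimate {
    // Split size formula as I understand it from the old-API FileInputFormat:
    // max(minSize, min(goalSize, blockSize)).
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Estimate the number of splits (and so map tasks) for one splittable file.
    static long estimateSplits(long totalSize, int requestedMaps,
                               long minSize, long blockSize) {
        // goalSize is the input size divided by the mapred.map.tasks hint.
        long goalSize = totalSize / Math.max(requestedMaps, 1);
        long splitSize = computeSplitSize(goalSize, minSize, blockSize);
        // Ceiling division: a partial trailing chunk still needs its own split.
        return (totalSize + splitSize - 1) / splitSize;
    }

    public static void main(String[] args) {
        long totalSize = 10L * 1024 * 1024 * 1024; // 10 GB input file
        long blockSize = 64L * 1024 * 1024;        // assumed HDFS block size
        long minSize = 1;                          // assumed minimum split size
        int requestedMaps = 30;                    // mapred.map.tasks from the config above

        System.out.println("Estimated map tasks: "
                + estimateSplits(totalSize, requestedMaps, minSize, blockSize));
    }
}
```

Under these assumptions the block size, not the requested 30 maps, caps the split size, so a splittable 10 GB file should produce many more than one map task. Seeing exactly one therefore points at the input itself or at the job actually picking up a different configuration.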
