Re: Mapper runs only on one machine

Steve Lewis Tue, 16 Nov 2010 09:33:54 -0800

Are you sure your input file is splittable? Many formats (gzip, say) are not,
and such a file has to be processed by a single map task, and therefore on a
single machine.
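
A quick way to check this outside Hadoop is sketched below. It is only a rough illustration, not Hadoop's own logic: the helper names (looks_like_gzip, estimated_map_tasks) are made up for this example, the 64 MB block size is an assumption about your dfs.block.size setting, and the estimate ignores minimum split size settings and the details of your InputFormat.

```python
import gzip
import math
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def looks_like_gzip(path):
    """Return True if the file begins with the gzip magic number."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def estimated_map_tasks(size_bytes, splittable, block_size=64 * 1024 * 1024):
    """Rough estimate: one map task per block-sized split, or a single
    task when the file cannot be split. block_size is an assumption."""
    if not splittable:
        return 1
    return max(1, math.ceil(size_bytes / block_size))


if __name__ == "__main__":
    # Build two small sample files so the sketch runs on its own.
    workdir = tempfile.mkdtemp()
    plain_path = os.path.join(workdir, "input.txt")
    gz_path = os.path.join(workdir, "input.txt.gz")
    with open(plain_path, "wb") as f:
        f.write(b"some records\n")
    with gzip.open(gz_path, "wb") as f:
        f.write(b"some records\n")

    ten_gb = 10 * 1024 ** 3  # size of the input file from the question
    for path in (plain_path, gz_path):
        splittable = not looks_like_gzip(path)
        print(path, "splittable:", splittable,
              "estimated maps for 10GB:", estimated_map_tasks(ten_gb, splittable))
```

Under those assumptions, a splittable 10GB file would give something on the order of 160 map tasks, while a gzipped one gives exactly one no matter what mapred.map.tasks says, since that setting is only a hint.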


On Tue, Nov 16, 2010 at 9:24 AM, <praveen.pe...@nokia.com> wrote:

>  Hi all,
> I have been trying to figure out why all mappers run on only one machine
> when I have a 4-node cluster. The reduce part is running fine on all 4
> nodes. I am using 0.20.2. My input is a single large file (10GB).
>
> Here is my config in mapred-site.xml. I set mapred.map.tasks to 30, but I
> only see one map task, and that on only one machine. Are there any other
> parameters I need to set in order to distribute the map tasks evenly across
> the nodes?
> <configuration>
>         <property>
>           <name>mapred.job.tracker</name>
>            <value>master-hadoop:54311</value>
>           <description>The host and port that the MapReduce job tracker
>           runs at.  If "local", then jobs are run in-process as a single
>           map and reduce task.
>           </description>
>         </property>
>         <property>
>           <name>mapred.child.java.opts</name>
>           <value>-Xmx4096m</value>
>           <description>map heap size for child task</description>
>         </property>
>         <property>
>           <name>mapred.reduce.parallel.copies</name>
>           <value>5</value>
>           <description></description>
>         </property>
>         <property>
>           <name>mapred.map.tasks</name>
>           <value>30</value>
>           <description></description>
>         </property>
>         <property>
>           <name>mapred.reduce.tasks</name>
>           <value>6</value>
>           <description></description>
>         </property>
> </configuration>
>
>



-- 
Steven M. Lewis PhD
4221 105th Ave Ne
Kirkland, WA 98033
206-384-1340 (cell)
Institute for Systems Biology
Seattle WA
