Matt,
You have 2 threads per core, so your Linux box thinks an 8-core box has 16 
cores. In my calcs, I tend to take a whole core each for the TT, DN, and RS, 
and then a thread per slot, so you end up with 10 slots per node. Of course, 
memory is also a factor.

Note this is only a starting point; you can always tune up.
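
The slot arithmetic above, and the variant in the quoted message below, can be sketched in a few lines of Python. This is only an illustration: the function names and parameters are hypothetical, TT, DN, and RS are assumed to mean TaskTracker, DataNode, and RegionServer, and the heap figure simply multiplies the slot count by the -Xmx512m value from the quoted config, ignoring the daemons' own memory.

```python
def task_slots(cores, threads_per_core, reserved_cores=0, reserved_threads=0):
    """Return the hardware threads left for map/reduce slots after reservations.

    Hypothetical helper: one slot per remaining hardware thread.
    """
    total_threads = cores * threads_per_core
    reserved = reserved_cores * threads_per_core + reserved_threads
    if reserved > total_threads:
        raise ValueError("reserved more threads than the node has")
    return total_threads - reserved


def child_heap_mb(slots, xmx_mb):
    """Heap the task JVMs could claim if every slot is busy (daemons excluded)."""
    return slots * xmx_mb


# 8 cores x 2 threads, one whole core each for TT, DN, and RS.
print(task_slots(8, 2, reserved_cores=3))    # 10
# 4 cores x 2 threads, one thread each for the DataNode and TaskTracker.
print(task_slots(4, 2, reserved_threads=2))  # 6
# Rough memory check: 10 slots at -Xmx512m each.
print(child_heap_mb(10, 512))                # 5120
```

The resulting slot count still has to be split between mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum, and that split depends on the job load.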

Sent from a remote device. Please excuse any typos...

Mike Segel

On Jun 27, 2011, at 11:11 PM, "GOEKE, MATTHEW (AG/1000)" 
<[email protected]> wrote:

> Per node: 4 cores * 2 processes = 8 slots
> Datanode: 1 slot
> Tasktracker: 1 slot
> 
> Therefore max of 6 slots between mappers and reducers.
> 
> Below is part of our mapred-site.xml. The thing to keep in mind is that the 
> number of maps is defined by the number of input splits (which is defined by 
> your data), so you only need to worry about setting the maximum number of 
> concurrent processes per node. In this case the properties you want to home 
> in on are mapred.tasktracker.map.tasks.maximum and 
> mapred.tasktracker.reduce.tasks.maximum. Keep in mind there are a LOT of 
> other tuning improvements that can be made, but they require a strong 
> understanding of your job load.
> 
> <configuration>
>  <property>
>    <name>mapred.tasktracker.map.tasks.maximum</name>
>    <value>2</value>
>  </property>
> 
>  <property>
>    <name>mapred.tasktracker.reduce.tasks.maximum</name>
>    <value>1</value>
>  </property>
> 
>  <property>
>    <name>mapred.child.java.opts</name>
>    <value>-Xmx512m</value>
>  </property>
> 
>  <property>
>    <name>mapred.compress.map.output</name>
>    <value>true</value>
>  </property>
> 
>  <property>
>    <name>mapred.output.compress</name>
>    <value>true</value>
>  </property>
> </configuration>
