[ 
https://issues.apache.org/jira/browse/HADOOP-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12532921
 ] 

Yuri Pradkin commented on HADOOP-1245:
--------------------------------------

I too came across this problem the hard way (our weaker nodes started 
thrashing because they were running too many jobs).  In my case I don't even 
see it reverting to the desired behavior after the first round of jobs is
finished; it's always the jobtracker's config value.  Maybe I'm using a 
different version of Hadoop?  (It's 0.15 devel from svn.)

If anyone knows a way to run a different number of tasks on different boxes, 
please let me know, because from what I see, there is no way to do it, which 
makes our Hadoop cluster as lame as the lamest node in it.

> value for mapred.tasktracker.tasks.maximum taken from two different sources
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-1245
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1245
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.12.3
>            Reporter: Michael Bieniosek
>
> I want to create a cluster with machines with different numbers of CPUs.  
> Consequently, each machine should have a different value for 
> mapred.tasktracker.tasks.maximum, since my map tasks are CPU bound.
> However, hadoop uses BOTH the values for mapred.tasktracker.tasks.maximum on 
> the jobtracker and the tasktracker.  
> When a new job starts up, the jobtracker uses its (single) value for 
> mapred.tasktracker.tasks.maximum to assign tasks.  This means that each 
> tasktracker gets the same number of tasks, regardless of how I configured 
> that particular machine.
> After the first task finishes on each tasktracker, the tasktracker will 
> request new tasks from the jobtracker according to the tasktracker's value 
> for mapred.tasktracker.tasks.maximum.  So after the first round of map tasks 
> is done, the cluster reverts to a mode that works well for heterogeneous 
> clusters.
> The jobtracker should not consult its config for the value of 
> mapred.tasktracker.tasks.maximum.  It should assign tasks (or allow 
> tasktrackers to request tasks) according to each tasktracker's value of 
> mapred.tasktracker.tasks.maximum.
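
To make the difference described in the report concrete, here is a minimal
sketch of the two assignment policies.  It is a toy model, not Hadoop's
scheduler code: the class name, method names, node names, and slot counts are
all made up for illustration, and each map entry stands in for one
tasktracker's configured mapred.tasktracker.tasks.maximum.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: none of these names exist in Hadoop.
public class SlotAssignmentSketch {

    // Reported behavior: the jobtracker applies its own single configured
    // maximum to every tasktracker and ignores the per-node values.
    public static Map<String, Integer> assignUsingJobTrackerValue(
            Map<String, Integer> trackerMax, int jobTrackerMax, int pendingTasks) {
        Map<String, Integer> assigned = new LinkedHashMap<>();
        int remaining = pendingTasks;
        for (String tracker : trackerMax.keySet()) {
            int n = Math.min(jobTrackerMax, remaining);
            assigned.put(tracker, n);
            remaining -= n;
        }
        return assigned;
    }

    // Requested behavior: each tasktracker is capped by its own configured
    // maximum, so smaller nodes receive fewer concurrent tasks.
    public static Map<String, Integer> assignUsingTrackerValues(
            Map<String, Integer> trackerMax, int pendingTasks) {
        Map<String, Integer> assigned = new LinkedHashMap<>();
        int remaining = pendingTasks;
        for (Map.Entry<String, Integer> e : trackerMax.entrySet()) {
            int n = Math.min(e.getValue(), remaining);
            assigned.put(e.getKey(), n);
            remaining -= n;
        }
        return assigned;
    }

    public static void main(String[] args) {
        // Hypothetical cluster: one weak node and one strong node.
        Map<String, Integer> trackerMax = new LinkedHashMap<>();
        trackerMax.put("small-node", 2);
        trackerMax.put("big-node", 8);

        // Jobtracker configured with 8: the weak node is overloaded.
        System.out.println(assignUsingJobTrackerValue(trackerMax, 8, 100));
        // Per-tracker values: the weak node gets only its 2 tasks.
        System.out.println(assignUsingTrackerValues(trackerMax, 100));
    }
}
```

In this model the first policy hands the weak node as many tasks as the strong
one, which matches the thrashing described in the comment above; the second
policy is what the report asks for.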

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
