[ 
https://issues.apache.org/jira/browse/HADOOP-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533014
 ] 

Rick Cox commented on HADOOP-1245:
----------------------------------

I've also run into this, and came up with a patch that involved:

- adding a 'maxTasks' value to the TaskTrackerStatus (set from that 
tasktracker's local mapred.tasktracker.tasks.maximum)
- modifying JobTracker to track a cluster-wide totalTaskCapacity instead of a 
single per-node maxTasks, and to use the particular task tracker's maxTasks 
value when deciding whether to assign it another task
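
A rough sketch of the two changes above, not the actual patch. The class 
names TrackerStatusSketch and JobTrackerSketch, the heartbeat() and 
canAssignTask() methods, and the runningTasks field are hypothetical 
stand-ins for illustration; only maxTasks and totalTaskCapacity come from 
the description above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for TaskTrackerStatus.
class TrackerStatusSketch {
    final String name;
    // Reported by the tasktracker itself, read from its own local
    // mapred.tasktracker.tasks.maximum.
    final int maxTasks;
    final int runningTasks;

    TrackerStatusSketch(String name, int maxTasks, int runningTasks) {
        this.name = name;
        this.maxTasks = maxTasks;
        this.runningTasks = runningTasks;
    }
}

// Hypothetical, simplified stand-in for the relevant JobTracker state.
class JobTrackerSketch {
    private final Map<String, TrackerStatusSketch> trackers =
        new HashMap<String, TrackerStatusSketch>();

    // Sum of maxTasks over all known trackers, replacing a single
    // jobtracker-side per-node value.
    private int totalTaskCapacity = 0;

    // Record the latest status from a tracker and keep the total in sync.
    void heartbeat(TrackerStatusSketch status) {
        TrackerStatusSketch old = trackers.put(status.name, status);
        if (old != null) {
            totalTaskCapacity -= old.maxTasks;
        }
        totalTaskCapacity += status.maxTasks;
    }

    // Decide using the tracker's own limit, not the jobtracker's config.
    boolean canAssignTask(String trackerName) {
        TrackerStatusSketch s = trackers.get(trackerName);
        return s != null && s.runningTasks < s.maxTasks;
    }

    int getTotalTaskCapacity() {
        return totalTaskCapacity;
    }
}
```

With this shape, a 2-CPU node reporting maxTasks=2 and an 8-CPU node 
reporting maxTasks=8 would each be filled up to their own limit from the 
first assignment round on.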

If this seems like a reasonable approach, I can do more testing and provide a 
patch.

> value for mapred.tasktracker.tasks.maximum taken from two different sources
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-1245
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1245
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.12.3
>            Reporter: Michael Bieniosek
>
> I want to create a cluster with machines with different numbers of CPUs.  
> Consequently, each machine should have a different value for 
> mapred.tasktracker.tasks.maximum, since my map tasks are CPU bound.
> However, Hadoop uses the values of mapred.tasktracker.tasks.maximum from BOTH 
> the jobtracker and the tasktracker.  
> When a new job starts up, the jobtracker uses its (single) value for 
> mapred.tasktracker.tasks.maximum to assign tasks.  This means that each 
> tasktracker gets the same number of tasks, regardless of how I configured 
> that particular machine.
> After the first task finishes on each tasktracker, the tasktracker will 
> request new tasks from the jobtracker according to the tasktracker's value 
> for mapred.tasktracker.tasks.maximum.  So after the first round of map tasks 
> is done, the cluster reverts to a mode that works well for heterogeneous 
> clusters.
> The jobtracker should not consult its config for the value of 
> mapred.tasktracker.tasks.maximum.  It should assign tasks (or allow 
> tasktrackers to request tasks) according to each tasktracker's value of 
> mapred.tasktracker.tasks.maximum.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
