[ https://issues.apache.org/jira/browse/HADOOP-1245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12533014 ]
Rick Cox commented on HADOOP-1245:
----------------------------------
I've also run into this, and came up with a patch that involved:
- adding a 'maxTasks' value to the TaskTrackerStatus (set by the local
  mapred.tasktracker.tasks.maximum)
- modifying JobTracker to track totalTaskCapacity instead of per-node maxTasks,
  and to use the particular task tracker's maxTasks value when deciding whether
  to assign it another task

If this seems like a reasonable approach, I can do more testing and provide a
patch.
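
To make the two changes concrete, here is a minimal sketch of the idea. It is not the actual patch: TrackerStatus and CapacityTracker are hypothetical stand-ins for TaskTrackerStatus and the relevant part of JobTracker, and the method names are assumptions made for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the status a tasktracker reports to the jobtracker.
final class TrackerStatus {
    final String trackerName;
    final int runningTasks;
    final int maxTasks; // taken from that node's local mapred.tasktracker.tasks.maximum

    TrackerStatus(String trackerName, int runningTasks, int maxTasks) {
        this.trackerName = trackerName;
        this.runningTasks = runningTasks;
        this.maxTasks = maxTasks;
    }
}

// Hypothetical stand-in for the jobtracker's capacity bookkeeping.
final class CapacityTracker {
    private final Map<String, TrackerStatus> trackers = new HashMap<>();
    private int totalTaskCapacity = 0;

    // Record the latest status, keeping totalTaskCapacity equal to the
    // sum of the maxTasks values reported by all known trackers.
    void updateStatus(TrackerStatus status) {
        TrackerStatus previous = trackers.put(status.trackerName, status);
        if (previous != null) {
            totalTaskCapacity -= previous.maxTasks;
        }
        totalTaskCapacity += status.maxTasks;
    }

    int getTotalTaskCapacity() {
        return totalTaskCapacity;
    }

    // Decide using the tracker's own reported limit rather than a single
    // jobtracker-wide setting. Unknown trackers are never assigned work.
    boolean canAssignTask(String trackerName) {
        TrackerStatus status = trackers.get(trackerName);
        return status != null && status.runningTasks < status.maxTasks;
    }
}
```

With this shape, a 2-slot node that is already running 2 tasks is refused further work while an 8-slot node running 2 tasks is still eligible, and the cluster-wide capacity is the sum of what the nodes report.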
> value for mapred.tasktracker.tasks.maximum taken from two different sources
> ---------------------------------------------------------------------------
>
> Key: HADOOP-1245
> URL: https://issues.apache.org/jira/browse/HADOOP-1245
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.12.3
> Reporter: Michael Bieniosek
>
> I want to create a cluster with machines with different numbers of CPUs.
> Consequently, each machine should have a different value for
> mapred.tasktracker.tasks.maximum, since my map tasks are CPU bound.
> However, Hadoop uses the values of mapred.tasktracker.tasks.maximum from BOTH
> the jobtracker and the tasktracker.
> When a new job starts up, the jobtracker uses its (single) value for
> mapred.tasktracker.tasks.maximum to assign tasks. This means that each
> tasktracker gets the same number of tasks, regardless of how I configured
> that particular machine.
> After the first task finishes on each tasktracker, the tasktracker will
> request new tasks from the jobtracker according to the tasktracker's value
> for mapred.tasktracker.tasks.maximum. So after the first round of map tasks
> is done, the cluster reverts to a mode that works well for heterogeneous
> clusters.
> The jobtracker should not consult its config for the value of
> mapred.tasktracker.tasks.maximum. It should assign tasks (or allow
> tasktrackers to request tasks) according to each tasktracker's value of
> mapred.tasktracker.tasks.maximum.