Assign multiple tasks per TaskTracker heartbeat
-----------------------------------------------
Key: HADOOP-3136
URL: https://issues.apache.org/jira/browse/HADOOP-3136
Project: Hadoop Core
Issue Type: Improvement
Components: mapred
Reporter: Devaraj Das
Fix For: 0.18.0
The current logic for finding a new task assigns only one task per heartbeat.
We could probably give the TaskTracker multiple tasks per heartbeat, up to the
number of free slots it has. For maps, we could assign it data-local tasks, and
run some logic to decide what to give it once we run out of data-local tasks
(e.g., tasks from overloaded racks, tasks that have the least locality, etc.).
In addition to maps, if it has free reduce slots, we could give it reduce
task(s) as well. Again, for reduces we could probably run some logic to give
more tasks to nodes that are closer to the nodes running the most maps (assuming
the data generated is proportional to the number of maps). For example, if rack1
has 70% of the input splits, we try to schedule ~70% of the reducers there.
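
To make the proposal a bit more concrete, here is a rough Java sketch of the
two ideas: filling a TaskTracker's free map slots in one heartbeat (data-local
tasks first, then whatever is left in the queue), and splitting reducers across
racks in proportion to their input splits. Everything in it is hypothetical:
HeartbeatAssignmentSketch, PendingMap, assignMaps and reducersPerRack are
made-up names, not existing JobTracker code. The fallback order for non-local
tasks and the rounding rule for reducers are assumptions as well.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names are actual Hadoop classes or methods.
final class HeartbeatAssignmentSketch {

    /** A pending map task plus the hosts that hold its input split (assumed shape). */
    static final class PendingMap {
        final String id;
        final List<String> splitHosts;

        PendingMap(String id, List<String> splitHosts) {
            this.id = id;
            this.splitHosts = splitHosts;
        }
    }

    /** Picks up to freeSlots map tasks for one heartbeat from the given host. */
    static List<PendingMap> assignMaps(String host, int freeSlots, List<PendingMap> pending) {
        List<PendingMap> chosen = new ArrayList<>();
        // Pass 1: tasks whose input split lives on this host.
        for (PendingMap task : pending) {
            if (chosen.size() >= freeSlots) break;
            if (task.splitHosts.contains(host)) chosen.add(task);
        }
        // Pass 2 (assumed fallback): remaining tasks in queue order.
        for (PendingMap task : pending) {
            if (chosen.size() >= freeSlots) break;
            if (!chosen.contains(task)) chosen.add(task);
        }
        return chosen;
    }

    /** Splits totalReducers across racks in proportion to their input splits. */
    static Map<String, Integer> reducersPerRack(Map<String, Integer> splitsPerRack,
                                                int totalReducers) {
        long totalSplits = 0;
        for (int n : splitsPerRack.values()) totalSplits += n;

        Map<String, Integer> share = new LinkedHashMap<>();
        if (totalSplits == 0) return share;

        Map<String, Long> remainder = new HashMap<>();
        int assigned = 0;
        for (Map.Entry<String, Integer> e : splitsPerRack.entrySet()) {
            long product = (long) totalReducers * e.getValue();
            int whole = (int) (product / totalSplits); // rounded-down share
            share.put(e.getKey(), whole);
            remainder.put(e.getKey(), product % totalSplits);
            assigned += whole;
        }
        // Hand reducers lost to rounding to the racks with the largest remainders.
        List<String> racks = new ArrayList<>(splitsPerRack.keySet());
        racks.sort((a, b) -> Long.compare(remainder.get(b), remainder.get(a)));
        for (int i = 0; assigned < totalReducers; i++, assigned++) {
            share.merge(racks.get(i % racks.size()), 1, Integer::sum);
        }
        return share;
    }

    public static void main(String[] args) {
        List<PendingMap> pending = Arrays.asList(
            new PendingMap("m1", Arrays.asList("h2")),
            new PendingMap("m2", Arrays.asList("h1")),
            new PendingMap("m3", Arrays.asList("h1", "h3")));
        for (PendingMap task : assignMaps("h1", 2, pending)) {
            System.out.println(task.id);
        }

        Map<String, Integer> splits = new LinkedHashMap<>();
        splits.put("rack1", 70);
        splits.put("rack2", 30);
        System.out.println(reducersPerRack(splits, 10));
    }
}
```

With 70 splits on rack1, 30 on rack2 and 10 reducers, the sketch gives rack1 7
reducers and rack2 3, which matches the ~70% example above. A real scheduler
would also need to handle rack-local (not just host-local) maps and per-job
fairness, which this leaves out.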
Thoughts?