[ https://issues.apache.org/jira/browse/HADOOP-3738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Spyros Blanas updated HADOOP-3738:
----------------------------------
Attachment: dynamic_heartbeat.patch
> Spawning tasks faster
> ---------------------
>
> Key: HADOOP-3738
> URL: https://issues.apache.org/jira/browse/HADOOP-3738
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Spyros Blanas
> Priority: Minor
> Attachments: dynamic_heartbeat.patch
>
>
> In the current implementation, tasks are assigned to tasktrackers by adding
> an appropriate action to the heartbeat response list. Each heartbeat response
> can start only one task. Because the minimum interval between heartbeats is
> 5 sec (by default), on an idle cluster of powerful machines (say, each node
> has 10 task "slots") tasks are spawned one heartbeat apart: in this example,
> the last task on a node is spawned after 45 seconds.
> Spawning tasks faster can significantly improve the end-to-end execution
> time when most jobs finish within a few minutes.
> The attached patch asks each TaskTracker to send its next heartbeat after
> 1/5th of the regular heartbeat interval if it was assigned a task in this
> round, which makes spawning multiple tasks much more efficient.
> A better approach would be to have each TaskTracker report the number of free
> slots it has (instead of only whether it can accept more work) and have the
> JobTracker push the appropriate number of tasks in one response, but this
> would require changes to the current communication protocol.
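
The timing argument in the description can be checked with a small back-of-the-envelope model. The Python sketch below is not taken from the patch or from Hadoop: the function name `spawn_times` and its parameters are hypothetical. It assumes that a single idle node receives its first task at t = 0, that every heartbeat response starts exactly one task, and that network and scheduling latency are negligible.

```python
def spawn_times(slots, interval=5.0, speedup=1.0):
    """Return the time (in seconds) at which each task starts on one idle node.

    Hypothetical model: one task is started per heartbeat response, the first
    at t = 0, and each following heartbeat arrives interval / speedup seconds
    after the previous one.
    """
    step = interval / speedup
    return [i * step for i in range(slots)]


# Regular heartbeat: 10 slots, 5 s between heartbeats.
fixed = spawn_times(10)

# Dynamic heartbeat as described in the issue: the next heartbeat is sent
# after 1/5th of the regular interval whenever a task was just assigned.
dynamic = spawn_times(10, speedup=5)

# Alternative from the last paragraph: all free slots filled in one response.
batched = [0.0] * 10

print(fixed[-1])    # 45.0
print(dynamic[-1])  # 9.0
print(batched[-1])  # 0.0
```

Under these assumptions the last of 10 tasks starts after 45 seconds with the regular interval and after 9 seconds with the shortened one, while pushing one task per free slot in a single response would remove the delay at the cost of a protocol change.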
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.