[ https://issues.apache.org/jira/browse/HADOOP-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609490#action_12609490 ]
Amar Kamat commented on HADOOP-3674:
------------------------------------
Leitao,
Generally we try to make sure that we don't waste any compute cycles on the
tasktrackers. If this is a big performance hit, then we might need to rethink
this, although we can bias the decision of which task to hand out based on
various parameters. See HADOOP-2812 and HADOOP-2014, which are somewhat
related. Let us know why you feel a non-greedy approach works better here.
> dynamic heartbeat interval for the locality-aware task scheduling
> -----------------------------------------------------------------
>
> Key: HADOOP-3674
> URL: https://issues.apache.org/jira/browse/HADOOP-3674
> Project: Hadoop Core
> Issue Type: Wish
> Components: mapred
> Reporter: Leitao Guo
> Priority: Minor
>
> In the current Hadoop release (0.17.0), there is no special scheduling policy
> for tasktrackers that hold no data for a given job, which can be inefficient
> in some scenarios. For example, tasktracker A has the data for a job, but
> tasktracker B, which has no data for this job, sends its heartbeat message
> to the jobtracker asking for a new task before tasktracker A does. The task
> may then be scheduled on B instead of A, and the jobtracker has to find a
> different task for tasktracker A when A asks for one.
> In this situation it would be more efficient for the jobtracker to have a
> reservation policy, such as reserving the task for tasktracker A and letting
> B ask for a new task in its next heartbeat message. By the time tasktracker
> B asks for a new task the second time, tasktracker A should already have
> asked for one and the jobtracker should have scheduled the task on A.
> Here is a rough idea for dealing with the scenario above:
> (1) The jobtracker receives a heartbeat message from tasktracker B, which
> has no data for any job.
> (2) The jobtracker sends a response to tasktracker B with a new heartbeat
> interval, but does not schedule a new task on B. The new heartbeat
> interval should be shorter than the current heartbeat interval, for example,
> current_heartbeat_interval/2.
> (3) Tasktracker B receives the response from the jobtracker and sends another
> heartbeat message asking for a new task after
> current_heartbeat_interval/2.
> (4) The jobtracker then finds a new task for tasktracker B.
> This is just a preliminary idea for improving locality-aware
> scheduling. Any comments are welcome.
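
To make steps (1) to (4) concrete, here is a minimal Python sketch of the
proposed policy. It is only an illustration, not Hadoop code: the Task and
JobTracker classes, the heartbeat() method, and the 10-second interval are
all hypothetical names and values chosen for the example. The sketch assumes
a tasktracker is deferred at most once, so it still receives a non-local task
on its second request if no tasktracker with local data has claimed it.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    hosts: frozenset  # tasktrackers that hold this task's input data


@dataclass
class JobTracker:
    pending: list                     # tasks not yet scheduled
    interval: float = 10.0            # normal heartbeat interval (assumed value)
    deferred: set = field(default_factory=set)  # trackers deferred once already

    def heartbeat(self, tracker):
        """Return (task or None, next heartbeat interval) for this tracker."""
        if not self.pending:
            self.deferred.discard(tracker)
            return None, self.interval
        # Prefer a task whose input data is local to the requesting tracker.
        for task in self.pending:
            if tracker in task.hosts:
                self.pending.remove(task)
                self.deferred.discard(tracker)
                return task, self.interval
        # Step (2): no local data, so defer once with a halved interval.
        if tracker not in self.deferred:
            self.deferred.add(tracker)
            return None, self.interval / 2
        # Step (4): second request, hand out a non-local task.
        self.deferred.discard(tracker)
        return self.pending.pop(0), self.interval


# Scenario from the description: B asks first, but only A has the data.
jt = JobTracker(pending=[Task("t1", frozenset({"A"}))])
first_b = jt.heartbeat("B")    # deferred, told to come back sooner
a = jt.heartbeat("A")          # receives the data-local task
second_b = jt.heartbeat("B")   # nothing left to schedule

# If A never asks, B still gets the task on its second heartbeat.
jt2 = JobTracker(pending=[Task("t2", frozenset({"A"}))])
jt2.heartbeat("B")
fallback = jt2.heartbeat("B")
```

Deferring only once bounds the idle time that the non-greedy policy can cost a
tasktracker to half a heartbeat interval, which is the trade-off against wasted
compute cycles raised in the comment above.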