[ 
https://issues.apache.org/jira/browse/HADOOP-3674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12609490#action_12609490
 ] 

Amar Kamat commented on HADOOP-3674:
------------------------------------

Leitao,
Generally we try to make sure that we don't waste any compute cycles on the 
tasktrackers. If this is a big performance hit then we might need to rethink 
this, although we can bias the decision of what to assign based on various 
parameters. Check HADOOP-2812 and HADOOP-2014, which are somewhat related. Let us 
know why you feel that a non-greedy approach works better here. 

> dynamic heartbeat interval for the locality-aware task scheduling
> -----------------------------------------------------------------
>
>                 Key: HADOOP-3674
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3674
>             Project: Hadoop Core
>          Issue Type: Wish
>          Components: mapred
>            Reporter: Leitao Guo
>            Priority: Minor
>
> In the current Hadoop release (0.17.0), there is no special scheduling policy 
> for tasktrackers that have no data for a given job, so scheduling can be 
> inefficient in some scenarios. For example, tasktracker A has the data for a 
> job, but tasktracker B, which has no data for this job, sends its heartbeat 
> message to the jobtracker asking for a new task before tasktracker A does. The 
> task may then be scheduled on B instead of A, and the jobtracker has to find a 
> different task for tasktracker A when A asks for one. 
> In this situation, it would be more efficient if the jobtracker had a 
> reservation policy, such as reserving the task for tasktracker A and letting B 
> ask for a new task in its next heartbeat message. By the time tasktracker B 
> asks for a new task the second time, tasktracker A will have requested a task 
> and the jobtracker will have scheduled the task on A.
> Here is a rough idea for dealing with the scenario above:
> (1) The jobtracker receives a heartbeat message from tasktracker B, which 
> has no data for any job.
> (2) The jobtracker sends a response to tasktracker B with a new heartbeat 
> interval, but does not schedule a new task on B. The new heartbeat 
> interval should be shorter than the current heartbeat interval, for example, 
> current_heartbeat_interval/2.
> (3) Tasktracker B receives the response from the jobtracker and sends another 
> heartbeat message asking for a new task after 
> current_heartbeat_interval/2.
> (4) The jobtracker then finds a new task for tasktracker B.
> This is just a preliminary idea for improving locality-aware 
> scheduling. Any comments are welcome.
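
Steps (1) to (4) of the quoted proposal can be sketched as a toy simulation.
This is only an illustration under stated assumptions, not Hadoop code: the
names Task, HeartbeatResponse, JobTracker.heartbeat and the "deferred" set are
all hypothetical, and the rule "defer a tracker at most once" is one possible
reading of step (4).

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass
class Task:
    task_id: str
    hosts: FrozenSet[str]  # trackers that hold the input data (hypothetical field)


@dataclass
class HeartbeatResponse:
    task: Optional[Task]   # None means "no task assigned this round"
    next_interval: float   # interval the tracker should wait before its next heartbeat


class JobTracker:
    """Toy model of the proposal; not the real Hadoop JobTracker."""

    def __init__(self, tasks: List[Task], heartbeat_interval: float = 10.0):
        self.pending = list(tasks)
        self.heartbeat_interval = heartbeat_interval
        self.deferred = set()  # trackers already told to retry once (assumption)

    def heartbeat(self, tracker: str) -> HeartbeatResponse:
        # Prefer a pending task whose data is local to the requesting tracker.
        for task in self.pending:
            if tracker in task.hosts:
                self.pending.remove(task)
                self.deferred.discard(tracker)
                return HeartbeatResponse(task, self.heartbeat_interval)
        # Step (2): no local task, first request. Assign nothing and ask the
        # tracker to come back after half the normal interval.
        if self.pending and tracker not in self.deferred:
            self.deferred.add(tracker)
            return HeartbeatResponse(None, self.heartbeat_interval / 2)
        # Step (4): second request (or nothing pending). Hand out any
        # remaining task, local or not, and restore the normal interval.
        self.deferred.discard(tracker)
        task = self.pending.pop(0) if self.pending else None
        return HeartbeatResponse(task, self.heartbeat_interval)


# Scenario from the description: B (no local data) heartbeats before A.
jt = JobTracker([Task("t1", frozenset({"A"}))], heartbeat_interval=10.0)
first_b = jt.heartbeat("B")   # deferred: no task, shortened interval
first_a = jt.heartbeat("A")   # A arrives in the meantime and gets its local task
second_b = jt.heartbeat("B")  # nothing left for B to run non-locally
```

If A had not sent a heartbeat before B's second request, the second call for B
would have received t1 anyway, so in this sketch a tracker idles for at most
half a heartbeat interval. That bounds the wasted compute cycles that the
comment above is concerned about.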

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
