[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13002757#comment-13002757
 ] 

Dick King commented on MAPREDUCE-2355:
--------------------------------------

We need this because when many jobs have short tasks, the job tracker can be 
overwhelmed by too many heartbeats.

I think that the patch should have two pieces.

1: On any one node, we should delay an out-of-band heartbeat that would 
otherwise be sent too soon after the most recent heartbeat, in the hope of 
reporting multiple task attempt completions in one heartbeat and thus reducing 
the total load placed on the job tracker.  This is a trade-off, because the 
node won't get a new task immediately.
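
A minimal sketch of piece 1, to make the coalescing concrete.  The class 
OobHeartbeatDamper, the minGapMs knob, and both method names are hypothetical 
and are not existing TaskTracker code; times are plain millisecond values 
passed in by the caller.

```java
/** Hypothetical helper illustrating piece 1; not part of Hadoop. */
final class OobHeartbeatDamper {
    private final long minGapMs;     // assumed knob: minimum gap between heartbeats
    private long lastHeartbeatMs;    // time the most recent heartbeat was sent
    private int pendingCompletions;  // completions not yet reported

    OobHeartbeatDamper(long minGapMs, long lastHeartbeatMs) {
        this.minGapMs = minGapMs;
        this.lastHeartbeatMs = lastHeartbeatMs;
    }

    /**
     * Records a task attempt completion and returns the earliest time at
     * which the out-of-band heartbeat reporting it may be sent.
     */
    long taskCompleted(long nowMs) {
        pendingCompletions++;
        return Math.max(nowMs, lastHeartbeatMs + minGapMs);
    }

    /** Marks a heartbeat as sent; returns how many completions it carried. */
    int heartbeatSent(long nowMs) {
        int reported = pendingCompletions;
        pendingCompletions = 0;
        lastHeartbeatMs = nowMs;
        return reported;
    }
}
```

With a 500 ms gap and a last heartbeat at t=1000, completions at t=1100 and 
t=1300 are both deferred to t=1500 and go out in a single heartbeat.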

2: We should cap the total number of heartbeats over a time interval.  The cap 
and the interval should be configurable.  If the interval is INT and the cap 
is C, we should track the times of the last C heartbeats we sent.  If the time 
T of the oldest one is less than INT ago and we otherwise meet the criteria 
for sending a heartbeat, we should unconditionally send one at time T + INT 
rather than immediately.
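
The rule in piece 2 can be sketched as a sliding window over the last C send 
times.  HeartbeatRateCap and its method names are hypothetical; cap and 
intervalMs stand in for C and INT, and the caller is assumed to have already 
decided that a heartbeat is otherwise due.

```java
import java.util.ArrayDeque;

/** Hypothetical helper illustrating piece 2; not part of Hadoop. */
final class HeartbeatRateCap {
    private final int cap;          // C: max heartbeats per interval
    private final long intervalMs;  // INT: length of the interval
    private final ArrayDeque<Long> sentTimes = new ArrayDeque<>();

    HeartbeatRateCap(int cap, long intervalMs) {
        if (cap < 1) {
            throw new IllegalArgumentException("cap must be at least 1");
        }
        this.cap = cap;
        this.intervalMs = intervalMs;
    }

    /** Earliest time a heartbeat that is otherwise due may be sent. */
    long earliestSendTime(long nowMs) {
        if (sentTimes.size() < cap) {
            return nowMs;                       // fewer than C sent so far
        }
        long oldest = sentTimes.peekFirst();    // T: oldest of the last C
        return Math.max(nowMs, oldest + intervalMs);
    }

    /** Records a sent heartbeat, keeping only the last C send times. */
    void recordSent(long sentMs) {
        if (sentTimes.size() == cap) {
            sentTimes.removeFirst();
        }
        sentTimes.addLast(sentMs);
    }
}
```

For C = 3 and INT = 1000 ms, heartbeats sent at t=0, 100 and 200 push the next 
one out to t=1000, which is T + INT for the oldest tracked send.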

Since principle 2 may induce a longish delay, perhaps each heartbeat should say 
when the next heartbeat should occur?  This makes the patch a bigger deal: up 
to now all changes could be localized to the TaskTracker, and with this they 
no longer can be.  It might still be worthwhile.
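
One possible reading of that suggestion, as a sketch only: the heartbeat 
carries the sender's planned time for its next heartbeat, and the receiver 
uses that instead of a fixed expiry.  AnnouncedHeartbeat, its fields, and the 
graceMs parameter are all hypothetical and do not correspond to an existing 
Hadoop message format.

```java
/** Hypothetical heartbeat payload that announces the next heartbeat time. */
final class AnnouncedHeartbeat {
    final long sentAtMs;          // when this heartbeat was sent
    final long nextExpectedAtMs;  // when the sender plans to send the next one

    AnnouncedHeartbeat(long sentAtMs, long nextExpectedAtMs) {
        if (nextExpectedAtMs < sentAtMs) {
            throw new IllegalArgumentException("next heartbeat precedes this one");
        }
        this.sentAtMs = sentAtMs;
        this.nextExpectedAtMs = nextExpectedAtMs;
    }

    /** Receiver-side check: has the announced next heartbeat been missed? */
    boolean isOverdue(long nowMs, long graceMs) {
        return nowMs > nextExpectedAtMs + graceMs;
    }
}
```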

> Add an out of band heartbeat damper
> -----------------------------------
>
>                 Key: MAPREDUCE-2355
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2355
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker
>            Reporter: Owen O'Malley
>            Assignee: Arun C Murthy
>
> We should have a configurable knob to throttle how many out of band 
> heartbeats are sent.

-- 
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
