[ 
https://issues.apache.org/jira/browse/HADOOP-1900?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12539067
 ] 

Amareshwari Sri Ramadasu commented on HADOOP-1900:
--------------------------------------------------

In the attached patch, the jobtracker periodically recalculates the heartbeat 
interval. It looks at both the cluster size and how busy the jobtracker is. 
If the jobtracker is busy, the interval is incremented by a busyFactor. If it 
is not busy for two consecutive periods, the interval is decremented by the 
busyFactor.
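
A minimal sketch of the increment/decrement rule described above, not the 
code from the patch. The class name, the millisecond units, the lower bound 
and the choice to reset the idle counter after each decrement are all 
assumptions made for illustration. The cluster-size input is left out 
because the comment does not say how it enters the calculation.

```java
// Hypothetical illustration only; names and bounds are not from patch-1900.txt.
final class HeartbeatIntervalSketch {
    private final int minIntervalMs;  // assumed lower bound on the interval
    private final int busyFactorMs;   // step added or removed per adjustment
    private int intervalMs;           // current heartbeat interval
    private int idlePeriods;          // consecutive periods seen as not busy

    HeartbeatIntervalSketch(int minIntervalMs, int busyFactorMs) {
        this.minIntervalMs = minIntervalMs;
        this.busyFactorMs = busyFactorMs;
        this.intervalMs = minIntervalMs;
    }

    // Called once per recalculation period with the observed busyness.
    int recompute(boolean busy) {
        if (busy) {
            // Busy: back off immediately and restart the idle count.
            intervalMs += busyFactorMs;
            idlePeriods = 0;
        } else if (++idlePeriods >= 2) {
            // Two consecutive idle periods: shrink, but never below the floor.
            intervalMs = Math.max(minIntervalMs, intervalMs - busyFactorMs);
            idlePeriods = 0; // assumption: the count restarts after a decrement
        }
        return intervalMs;
    }
}
```

Requiring two idle periods before shrinking makes the interval grow faster 
than it shrinks, which damps oscillation when the load fluctuates.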

The map events polling interval is derived from the heartbeat interval, so it 
does not need a separate recalculation. It is calculated as follows:
  polling_interval = heartbeat_interval / 3;
  if polling_interval < MIN_POLLING_INTERVAL, then
      polling_interval = MIN_POLLING_INTERVAL;
  if polling_interval > MAX_POLLING_INTERVAL, then
      polling_interval = MAX_POLLING_INTERVAL;

MapEventsFetcherThread is notified if a reduce task doesn't find map events 
at the tasktracker.
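
The polling interval rule above amounts to dividing by three and clamping to 
a range. The sketch below shows that in Java. The class and method names are 
hypothetical, and the MIN/MAX values are placeholders, since the comment does 
not give the actual constants.

```java
// Hypothetical illustration only; constant values are assumed, not from the patch.
final class MapEventsPollingSketch {
    static final int MIN_POLLING_INTERVAL = 1000; // assumed value, in ms
    static final int MAX_POLLING_INTERVAL = 5000; // assumed value, in ms

    // Derive the map events polling interval from the heartbeat interval.
    static int fromHeartbeat(int heartbeatIntervalMs) {
        int pollingInterval = heartbeatIntervalMs / 3; // integer division
        // Clamp into [MIN_POLLING_INTERVAL, MAX_POLLING_INTERVAL].
        return Math.max(MIN_POLLING_INTERVAL,
                        Math.min(MAX_POLLING_INTERVAL, pollingInterval));
    }
}
```

With these placeholder bounds, a 9000 ms heartbeat would give a 3000 ms 
polling interval, while very short or very long heartbeats would be pinned to 
the minimum or maximum.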

bq.I propose a change to the status message in the heartbeat - the tasktracker 
can compare the current task status with the previous one and if it finds the 
status to be the same, it doesn't send the complete status object to the 
JobTracker, but just a flag saying it is a duplicate or something to that 
effect. That will reduce the data per RPC considerably for long running tasks 
whose statuses don't change frequently and also reduce the processing load on 
the JobTracker.

This will be addressed in a separate JIRA issue.


> the heartbeat and task event queries interval should be set dynamically by 
> the JobTracker
> -----------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1900
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1900
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Owen O'Malley
>            Assignee: Amareshwari Sri Ramadasu
>         Attachments: patch-1900.txt
>
>
> The JobTracker should dynamically scale the intervals that the TaskTrackers 
> use to contact it, based on how busy it is and the size of the cluster.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
