[ https://issues.apache.org/jira/browse/HADOOP-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702039#action_12702039 ]

Arun C Murthy commented on HADOOP-5632:
---------------------------------------

Well, we need to consider the 'peak' load, not averages... so it might be 
much higher than 2000 heartbeats per second. Also, the heartbeat isn't the only 
RPC coming through to the JobTracker: at a minimum we see roughly as many 
'getTaskCompletionEvents' RPCs from the TaskTrackers, and when enough events 
are available the number of RPCs is actually much higher (at peak load).

It's important to remember that the JobTracker needed some work to handle the 
4000-node cluster (if I remember right, this was even before we committed 
HADOOP-3297, which increased the number of getTaskCompletionEvents calls). 
Again, as I mentioned previously, my point is that we need to run this at 
scale before committing it...
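A rough sketch of the arithmetic behind the peak-load concern above. The 4000-tracker count comes from the comment; the 3-second heartbeat interval and one getTaskCompletionEvents RPC per heartbeat are illustrative assumptions, not numbers stated in this issue:

```python
def peak_rpc_rate(num_trackers, heartbeat_interval_s, events_rpcs_per_heartbeat):
    """Rough peak RPCs/sec seen by the JobTracker: one heartbeat per
    tracker per interval, plus getTaskCompletionEvents calls issued
    alongside each heartbeat. Ignores bursts, so this is a floor."""
    heartbeats_per_sec = num_trackers / heartbeat_interval_s
    return heartbeats_per_sec * (1 + events_rpcs_per_heartbeat)

# Hypothetical: 4000 trackers heartbeating every 3 seconds, with at
# least one getTaskCompletionEvents RPC per heartbeat.
print(peak_rpc_rate(4000, 3.0, 1))  # ~2667 RPCs/sec before any bursts
```

Even with conservative assumptions the JobTracker sees well over the heartbeat rate alone, which is why averages understate the load.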



> Jobtracker leaves tasktrackers underutilized
> --------------------------------------------
>
>                 Key: HADOOP-5632
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5632
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.18.0, 0.18.1, 0.18.2, 0.18.3, 0.19.0, 0.19.1, 0.20.0
>         Environment: 2x HT 2.8GHz Intel Xeon, 3GB RAM, 4x 250GB HD linux 
> boxes, 100 node cluster
>            Reporter: Khaled Elmeleegy
>         Attachments: hadoop-khaled-tasktracker.10s.uncompress.timeline.pdf, 
> hadoop-khaled-tasktracker.150ms.uncompress.timeline.pdf, jobtracker.patch, 
> jobtracker20.patch
>
>
> For some workloads, the jobtracker doesn't keep all the slots utilized even 
> under heavy load.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
