[ 
https://issues.apache.org/jira/browse/HADOOP-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701543#action_12701543
 ] 

Arun C Murthy commented on HADOOP-5632:
---------------------------------------

bq. ?? The sequence number seems strange.

The sequence number is necessary to ensure that lost RPCs do not lead to lost 
tasks... e.g. previously, if the response to the heartbeat (containing the 
new tasks to be run) was lost, the JobTracker didn't realize it until the 
'task initialization timeout', which is fairly long (10 minutes by default). 
The TaskTracker, unaware of the lost response, would go ahead and request 
tasks afresh... hence the sequence numbers.

This was the problem Owen was alluding to in 
https://issues.apache.org/jira/browse/HADOOP-5632?focusedCommentId=12696297&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12696297.
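The scheme could be sketched roughly as follows. This is a minimal illustration of the idea, not the actual Hadoop API: all class and field names here are hypothetical. The TaskTracker echoes the id of the last response it saw; on a mismatch the JobTracker resends its previous response instead of assigning fresh tasks.

```java
import java.util.HashMap;
import java.util.Map;

public class HeartbeatSequenceSketch {

    static class HeartbeatResponse {
        final short responseId;
        final String[] actions;   // e.g. new tasks to launch
        HeartbeatResponse(short responseId, String[] actions) {
            this.responseId = responseId;
            this.actions = actions;
        }
    }

    // JobTracker side: remember the last response sent to each tracker.
    static class JobTrackerSide {
        private final Map<String, HeartbeatResponse> lastResponse = new HashMap<>();

        HeartbeatResponse heartbeat(String trackerName, short lastSeenResponseId) {
            HeartbeatResponse prev = lastResponse.get(trackerName);
            if (prev != null && lastSeenResponseId != prev.responseId) {
                // The tracker never saw our last response (the RPC reply was
                // lost in transit): resend it rather than assigning fresh
                // tasks, so nothing is silently dropped.
                return prev;
            }
            short nextId = (short) (prev == null ? 1 : prev.responseId + 1);
            HeartbeatResponse fresh =
                new HeartbeatResponse(nextId, new String[] {"LAUNCH_TASK_" + nextId});
            lastResponse.put(trackerName, fresh);
            return fresh;
        }
    }

    public static void main(String[] args) {
        JobTrackerSide jt = new JobTrackerSide();
        HeartbeatResponse r1 = jt.heartbeat("tt1", (short) 0);  // first contact
        // Suppose r1's reply was lost: the tracker still reports id 0.
        HeartbeatResponse r2 = jt.heartbeat("tt1", (short) 0);
        System.out.println(r1.responseId == r2.responseId);     // same response resent
        // The tracker acknowledges r2; only then does it get new work.
        HeartbeatResponse r3 = jt.heartbeat("tt1", r2.responseId);
        System.out.println(r3.responseId);
    }
}
```

With this handshake a lost reply costs one heartbeat interval rather than the 10-minute task-initialization timeout.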

----

I'd like to sound a cautionary note...

The big win with HADOOP-3136 (i.e. assigning multiple tasks per heartbeat) was 
that it significantly reduced the load on the JobTracker, especially on large 
clusters (I was running a 4000-node cluster), by cutting down the number of 
RPCs the JobTracker had to handle from 4000 tasktrackers. This was felt to be 
crucial as we attempted to scale out our clusters.

A lot of our current problems stem from the coarse-grained locking structure in 
the JobTracker, where we lock up the entire JobTracker before we handle _any_ 
RPC, including heartbeats. See HADOOP-869 for more details. The contention on 
that single lock is significant on larger clusters, especially when running 
short maps... it might be good to consider HADOOP-869 a prerequisite for the 
current proposal.
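To make the contrast concrete, here is a hypothetical sketch (not the JobTracker's actual code) of the two locking styles: the coarse-grained style serializes every RPC on one monitor, while a finer-grained alternative lets read-mostly RPCs proceed concurrently and only takes the write lock for heartbeats that mutate scheduling state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockingSketch {

    // Coarse-grained: one monitor guards everything, so every RPC
    // (heartbeat or simple status query) contends on the same lock.
    static class CoarseTracker {
        private final List<String> tasks = new ArrayList<>();
        synchronized void heartbeat(String tracker) { tasks.add("task-for-" + tracker); }
        synchronized int pendingTasks() { return tasks.size(); }
    }

    // Finer-grained: readers don't block each other; only mutating
    // heartbeats take the exclusive write lock.
    static class FinerTracker {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        private final List<String> tasks = new ArrayList<>();

        void heartbeat(String tracker) {
            lock.writeLock().lock();
            try { tasks.add("task-for-" + tracker); }
            finally { lock.writeLock().unlock(); }
        }

        int pendingTasks() {
            lock.readLock().lock();
            try { return tasks.size(); }
            finally { lock.readLock().unlock(); }
        }
    }

    public static void main(String[] args) {
        FinerTracker ft = new FinerTracker();
        ft.heartbeat("tt1");
        ft.heartbeat("tt2");
        System.out.println(ft.pendingTasks()); // 2
    }
}
```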

Overall, I'm not against this proposal; I'm merely stressing that we need to 
consider its effect on _much_ larger clusters before we commit to it, and we 
might have to do more work than just splitting up the RPCs, i.e. also fix the 
locking in the JobTracker.

----

OTOH an intermediate solution might be to fix HADOOP-5129, i.e. send a 
heartbeat immediately when the last running map/reduce completes, without 
waiting for the heartbeat interval per se. That should get us further ahead.
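The out-of-band heartbeat could look something like the sketch below (a hypothetical illustration, not the HADOOP-5129 patch): the heartbeat loop normally sleeps for the full interval, but a task-completion event wakes it so the next heartbeat, and with it a request for new work, goes out immediately.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class OutOfBandHeartbeatSketch {
    private final BlockingQueue<Object> wakeups = new LinkedBlockingQueue<>();
    private final long intervalMs;
    int heartbeatsSent = 0;

    OutOfBandHeartbeatSketch(long intervalMs) { this.intervalMs = intervalMs; }

    // Called by the task runner when the last running map/reduce finishes.
    void taskCompleted() { wakeups.offer(new Object()); }

    // One iteration of the heartbeat loop: fires early if woken by a
    // task completion, otherwise waits out the full interval.
    boolean runOneIteration() throws InterruptedException {
        Object wake = wakeups.poll(intervalMs, TimeUnit.MILLISECONDS);
        heartbeatsSent++;        // the heartbeat RPC would be sent here
        return wake != null;     // true => out-of-band (early) heartbeat
    }

    public static void main(String[] args) throws InterruptedException {
        OutOfBandHeartbeatSketch tt = new OutOfBandHeartbeatSketch(10_000);
        tt.taskCompleted();                   // a slot just freed up
        boolean early = tt.runOneIteration(); // returns without the 10s wait
        System.out.println(early);            // true
    }
}
```

This keeps the steady-state RPC rate unchanged while removing the idle gap between a slot freeing up and the next scheduled heartbeat.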


> Jobtracker leaves tasktrackers underutilized
> --------------------------------------------
>
>                 Key: HADOOP-5632
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5632
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>    Affects Versions: 0.18.0, 0.18.1, 0.18.2, 0.18.3, 0.19.0, 0.19.1, 0.20.0
>         Environment: 2x HT 2.8GHz Intel Xeon, 3GB RAM, 4x 250GB HD linux 
> boxes, 100 node cluster
>            Reporter: Khaled Elmeleegy
>         Attachments: hadoop-khaled-tasktracker.10s.uncompress.timeline.pdf, 
> hadoop-khaled-tasktracker.150ms.uncompress.timeline.pdf, jobtracker.patch, 
> jobtracker20.patch
>
>
> For some workloads, the jobtracker doesn't keep all the slots utilized even 
> under heavy load.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
