[
https://issues.apache.org/jira/browse/HADOOP-5632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12701543#action_12701543
]
Arun C Murthy commented on HADOOP-5632:
---------------------------------------
bq. ?? The sequence number seems strange.
The sequence number is necessary to ensure that lost RPCs do not lead to lost
tasks. For example, previously, if the response to the heartbeat (containing the
new tasks to be run) was 'lost', the JobTracker didn't realize it until the
'task initialization timeout', which is a fairly long time (10 minutes by default).
With sequence numbers, the TaskTracker can go ahead, ignore the lost RPC, and
request tasks afresh... hence the sequence numbers.
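A minimal sketch of the idea (hypothetical class and method names, not the actual JobTracker/InterTrackerProtocol code): the tracker echoes the id of the last response it actually received, so the server can detect a lost reply and resend the same actions instead of assigning duplicate tasks or waiting for a long timeout.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of sequence-numbered heartbeats; names and
// structure are invented for the sketch.
class HeartbeatServer {
    private final Map<String, Short> lastResponseId = new HashMap<>();
    private final Map<String, String> lastResponse = new HashMap<>();
    private short nextId = 0;
    private int taskCounter = 0;

    /** Handles one heartbeat; ackedId is the last response id the tracker saw. */
    String heartbeat(String tracker, short ackedId) {
        Short lastId = lastResponseId.get(tracker);
        if (lastId != null && ackedId != lastId) {
            // The tracker never received our previous response: resend it
            // verbatim rather than handing out (and duplicating) new tasks.
            return lastResponse.get(tracker);
        }
        String response = assignNewTasks();
        lastResponseId.put(tracker, ++nextId);
        lastResponse.put(tracker, response);
        return response;
    }

    private String assignNewTasks() {
        return "LaunchTaskAction:task_" + (taskCounter++);
    }
}
```

The point is that a dropped response is detected on the very next heartbeat, not after a 10-minute task-initialization timeout.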
This was the problem Owen was alluding to in
https://issues.apache.org/jira/browse/HADOOP-5632?focusedCommentId=12696297&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12696297.
----
I'd like to sound a cautionary note...
The big win with HADOOP-3136 (i.e. assigning multiple tasks per heartbeat) was
that it significantly reduced the load on the JobTracker, especially on large
clusters (I was running a 4000-node cluster), since it cut down the number of
RPCs that the JobTracker had to handle from 4000 tasktrackers. This was felt to
be crucial as we attempted to scale out our clusters.
A lot of our current problems stem from the coarse-grained locking structure in
the JobTracker, where we lock up the JobTracker before we handle _any_ RPC,
including heartbeats. See HADOOP-869 for more details. The contention on that
single lock is significant on larger clusters, especially when running short
maps... it might be good to consider HADOOP-869 a prerequisite for the current
proposal.
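To illustrate the locking pattern at issue (a sketch with invented class names, not the real JobTracker code): with one coarse monitor, every heartbeat queues up behind any other RPC, including long job-initialization calls; splitting the state behind separate locks lets heartbeats contend only with each other.

```java
// Illustrative only: contrasts the coarse-grained pattern with a
// finer-grained one. Hypothetical classes, not actual Hadoop code.
class CoarseGrainedTracker {
    private int heartbeats;
    // Every RPC serializes on the single object monitor, so thousands of
    // trackers' heartbeats wait behind any job operation in flight.
    synchronized int heartbeat() { return ++heartbeats; }
    synchronized void submitJob() { /* long job-initialization work */ }
}

class FinerGrainedTracker {
    private final Object trackerLock = new Object(); // tracker state only
    private final Object jobLock = new Object();     // job state only
    private int heartbeats;

    int heartbeat() {
        // Heartbeats contend only with other heartbeats, not job submission.
        synchronized (trackerLock) { return ++heartbeats; }
    }
    void submitJob() {
        synchronized (jobLock) { /* long job-initialization work */ }
    }
}
```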
Overall, I'm not against this proposal, merely stressing that we need to
consider the effect of this proposal on _much_ larger clusters before we commit
to it, and we might have to do more work than just split up the RPCs i.e. fix
locking in the JobTracker.
----
OTOH, an intermediate solution might be to fix HADOOP-5129, i.e. send a
heartbeat immediately when the last running map/reduce completes, without
respecting the heartbeat interval per se. It should get us further ahead.
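The scheduling change is small; a sketch under assumed names (this is the HADOOP-5129 idea, not its actual patch): a tracker that just went idle heartbeats right away to ask for work, while a busy tracker keeps the normal interval.

```java
// Hypothetical sketch of out-of-band heartbeats on task completion.
class HeartbeatScheduler {
    private final long intervalMs;
    private int runningTasks;
    private long lastHeartbeatAt;

    HeartbeatScheduler(long intervalMs, int runningTasks) {
        this.intervalMs = intervalMs;
        this.runningTasks = runningTasks;
    }

    void onHeartbeatSent(long now) { lastHeartbeatAt = now; }

    void onTaskCompleted() { runningTasks--; }

    /** When the next heartbeat should fire, given the current time. */
    long nextHeartbeatAt(long now) {
        // Idle tracker: heartbeat immediately to request new tasks instead
        // of leaving its slots empty until the interval expires.
        return runningTasks == 0 ? now : lastHeartbeatAt + intervalMs;
    }
}
```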
> Jobtracker leaves tasktrackers underutilized
> --------------------------------------------
>
> Key: HADOOP-5632
> URL: https://issues.apache.org/jira/browse/HADOOP-5632
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Affects Versions: 0.18.0, 0.18.1, 0.18.2, 0.18.3, 0.19.0, 0.19.1, 0.20.0
> Environment: 2x HT 2.8GHz Intel Xeon, 3GB RAM, 4x 250GB HD linux
> boxes, 100 node cluster
> Reporter: Khaled Elmeleegy
> Attachments: hadoop-khaled-tasktracker.10s.uncompress.timeline.pdf,
> hadoop-khaled-tasktracker.150ms.uncompress.timeline.pdf, jobtracker.patch,
> jobtracker20.patch
>
>
> For some workloads, the jobtracker doesn't keep all the slots utilized even
> under heavy load.