[
https://issues.apache.org/jira/browse/MAPREDUCE-2118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Joydeep Sen Sarma updated MAPREDUCE-2118:
-----------------------------------------
Attachment: mapreduce-2118.1.patch
Do not hold the JT lock around getSetupCleanupTasks.
This required a change to not call back to the JT (.createTaskEntry) from
TIP.addRunningTask (which forced the caller to hold the JT lock). Now we only
need the JIP lock to get the task from the Job. The changes to the JT data
structures (made in JT.createTaskEntry) are made separately (holding the JT
lock).
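A minimal sketch of this split, in Java. SketchJob (standing in for the JIP) and
SketchTracker (standing in for the JT), along with all method names, are
hypothetical and are not taken from the patch. The list of jobs passed to
heartbeat() is assumed to be a snapshot that is safe to iterate without the JT
lock.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical stand-in for JobInProgress (JIP); not the Hadoop class.
class SketchJob {
    private final Queue<String> pendingSetupCleanup = new ArrayDeque<>();

    SketchJob(String... attemptIds) {
        for (String id : attemptIds) {
            pendingSetupCleanup.add(id);
        }
    }

    // Only this job's lock is held; there is no call back into the tracker.
    synchronized String obtainSetupCleanupTask() {
        return pendingSetupCleanup.poll(); // null when nothing is pending
    }
}

// Hypothetical stand-in for JobTracker (JT); not the Hadoop class.
class SketchTracker {
    private final Map<String, String> attemptToTracker = new HashMap<>();

    String heartbeat(String trackerName, Iterable<SketchJob> jobs) {
        String attempt = null;
        // Step 1: scan the jobs WITHOUT the tracker lock; each job guards itself.
        for (SketchJob job : jobs) {
            attempt = job.obtainSetupCleanupTask();
            if (attempt != null) {
                break;
            }
        }
        // Step 2: update the tracker's tables separately, under the tracker lock.
        // Between step 1 and step 2 the job knows about the attempt but the
        // tracker's table does not yet contain it.
        if (attempt != null) {
            synchronized (this) {
                attemptToTracker.put(attempt, trackerName);
            }
        }
        return attempt;
    }

    synchronized String trackerFor(String attempt) {
        return attemptToTracker.get(attempt);
    }
}
```

The point of the sketch is only the lock ordering: the scan over jobs never
takes the tracker-wide lock, and the tracker-wide lock is held just for the map
update.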
We looked carefully at the implications of the JT data structures (task/tracker
maps) being potentially out of sync with the state of the JIP itself (the JIP
thinks a particular tip/attempt has been scheduled, but the JT will not find
it in its tables). We were not able to find code paths that were sensitive to
this. It helps that there's only one heartbeat from one tasktracker at a time.
Most of the lookups to find an attempt can only be made in the context of a
heartbeat call from the tasktracker where the attempt is scheduled. By
definition, we are already processing the heartbeat from this tracker at the
time of the divergence in the state of the job and the JT.
It should be possible to extend this strategy to remove JT lock requirements
around other code paths.
> optimize getJobSetupAndCleanupTasks
> ------------------------------------
>
> Key: MAPREDUCE-2118
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2118
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Reporter: Joydeep Sen Sarma
> Attachments: mapreduce-2118.1.patch
>
>
> In every heartbeat, while holding the JobTracker global lock, all jobs are
> scanned for job setup/cleanup and task setup/cleanup. On a large system with
> many trackers (and heartbeats) and many jobs, this becomes the bottleneck
> for JT throughput.
> One possible route may be to rework the code to not require the JT lock while
> asking the JIP whether it has a setup/cleanup task.