[
https://issues.apache.org/jira/browse/MAPREDUCE-2235?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Todd Lipcon resolved MAPREDUCE-2235.
------------------------------------
Resolution: Duplicate
Hi Vladimir. I think this was already covered by MAPREDUCE-1354 in trunk. Let
me know if you disagree and we can reopen.
> JobTracker "over-synchronization" makes it hang up in certain cases
> --------------------------------------------------------------------
>
> Key: MAPREDUCE-2235
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2235
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: jobtracker
> Affects Versions: 0.20.1, 0.20.2, 0.21.0
> Reporter: Vladimir Klimontovich
> Attachments: MAPREDUCE-2235-patch1.txt
>
>
> There is a general problem in the JobTracker.java code: it synchronizes on
> "this" everywhere, so only one method can execute at a time. When the job
> submit rate is low (less than one job every several seconds), the tracker
> works without a problem. When the job rate is high, the following problem
> occurs:
> Inside submitJob(), the JT copies the job jar + xml to the local filesystem.
> After that it does a "chmod" on those files. Hadoop does chmod by spawning a
> child process. When the JT heap is big (several gigabytes), spawning a child
> process takes a lot of time (because Java calls fork()); in our case it's
> about 1-2 seconds. So the job tracker can't handle high-frequency job submits.
> Apart from that, as the heartbeat() method is also synchronized, the JT stops
> processing heartbeats while the "this" monitor is held by submitJob(). That
> makes the JT think that a lot of TaskTrackers are down.
> The following solution could help:
> "chmod" is called from the submitJob() method under the following line:
> JobInProgress job = new JobInProgress(jobId, this, this.conf);
> This line could be moved out of the synchronized code:
> public JobStatus submitJob(JobID jobId) throws IOException {
>   synchronized (this) {
>     .... the rest
>   }
>   // Here we're leaving this line outside synchronized code as it doesn't
>   // depend on the state of JobTracker. Also this line
>   JobInProgress job = new JobInProgress(jobId, this, this.conf);
>   synchronized (this) {
>     .... the rest
>   }
> }
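
For reference, the restructuring proposed in the description can be sketched in
isolation roughly as follows. This is only an illustration and not the actual
JobTracker code or the change made in MAPREDUCE-1354: the class
NarrowLockSketch, the buildJob() helper (a stand-in for the slow JobInProgress
construction) and the String-based job map are all hypothetical. One thing the
sketch assumes is needed: once the critical section is split in two, state can
change between the two blocks, so the second block re-checks before
registering the job.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for JobTracker; all names here are illustrative only.
class NarrowLockSketch {
    private final Map<String, String> jobs = new HashMap<>();
    private int heartbeats = 0;

    // Stand-in for the slow JobInProgress construction (file copy + chmod).
    static String buildJob(String jobId, long costMillis) {
        try {
            Thread.sleep(costMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "job:" + jobId;
    }

    public String submitJob(String jobId, long costMillis) {
        synchronized (this) {
            // Cheap check against shared state, done under the monitor.
            String known = jobs.get(jobId);
            if (known != null) {
                return known;
            }
        }

        // Slow work runs with the monitor released, so heartbeat() and other
        // synchronized methods are not blocked while it is in progress.
        String job = buildJob(jobId, costMillis);

        synchronized (this) {
            // Re-check: another thread may have registered this id meanwhile.
            String existing = jobs.putIfAbsent(jobId, job);
            return existing != null ? existing : job;
        }
    }

    public synchronized int heartbeat() {
        return ++heartbeats;
    }

    public synchronized int jobCount() {
        return jobs.size();
    }
}
```

The trade-off is that two concurrent submissions of the same id may both do the
slow work, and one result is then discarded; whether that is acceptable depends
on what the real constructor does to the filesystem.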