Use finer-grained locks in JT getCounters implementation
--------------------------------------------------------

                 Key: MAPREDUCE-2114
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2114
             Project: Hadoop Map/Reduce
          Issue Type: Improvement
          Components: jobtracker
            Reporter: Joydeep Sen Sarma


We are bottlenecked on the JobTracker lock on our largest cluster. One pattern 
I have seen is the following:

- the JT heartbeat thread acquires the JobTracker lock, but is then blocked 
waiting for the JobInProgress (JIP) lock:

java.lang.Thread.State: BLOCKED (on object monitor)
at org.apache.hadoop.mapred.JobInProgress.updateTaskStatus(JobInProgress.java:1028)
- waiting to lock <0x00002aae21092ff8> (a org.apache.hadoop.mapred.JobInProgress)
at org.apache.hadoop.mapred.JobTracker.updateTaskStatuses(JobTracker.java:4403)
at org.apache.hadoop.mapred.JobTracker.processHeartbeat(JobTracker.java:3444)
- locked <0x00002aab6ebb6640> (a org.apache.hadoop.mapred.JobTracker)

- the JIP lock is typically held by a getCounters call:

- locked <0x00002aaaf88beff8> (a org.apache.hadoop.mapred.Counters$Group)
at org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:445)
- locked <0x00002aaaf88bb948> (a org.apache.hadoop.mapred.Counters)
at org.apache.hadoop.mapred.JobInProgress.incrementTaskCounters(JobInProgress.java:1253)
at org.apache.hadoop.mapred.JobInProgress.getCounters(JobInProgress.java:1240)
- locked <0x00002aae21092ff8> (a org.apache.hadoop.mapred.JobInProgress)

The solution seems simple: to sum the counters across all tasks, we only need 
to lock one task's counters at a time. We don't need to hold the lock on the 
entire job while doing so.
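
A minimal sketch of the idea, not the actual Hadoop code: TaskCounters and 
Job below are hypothetical stand-ins for the per-task counters and for 
JobInProgress. The job lock is assumed to be needed only long enough to copy 
the task list; after that, each task's counters are locked one at a time 
while they are added to the total.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FineGrainedCounters {

    /** Hypothetical stand-in for one task's counters (not org.apache.hadoop.mapred.Counters). */
    static class TaskCounters {
        private final Map<String, Long> values = new HashMap<>();

        /** Updates a counter while holding only this task's lock. */
        synchronized void incr(String name, long delta) {
            values.merge(name, delta, Long::sum);
        }

        /** Adds this task's counters into the caller's total while holding only this task's lock. */
        synchronized void addTo(Map<String, Long> total) {
            for (Map.Entry<String, Long> e : values.entrySet()) {
                total.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
    }

    /** Hypothetical stand-in for JobInProgress. */
    static class Job {
        private final List<TaskCounters> tasks = new ArrayList<>();

        synchronized void addTask(TaskCounters task) {
            tasks.add(task);
        }

        /** Not synchronized on the job for the duration of the summation. */
        Map<String, Long> getCounters() {
            List<TaskCounters> snapshot;
            synchronized (this) {
                // Hold the job lock only long enough to copy the task list.
                snapshot = new ArrayList<>(tasks);
            }
            // The total is local to this call, so it needs no lock of its own.
            Map<String, Long> total = new HashMap<>();
            for (TaskCounters task : snapshot) {
                task.addTo(total); // locks one task at a time
            }
            return total;
        }
    }
}
```

With this shape, a status update for the job would wait at most for the 
task-list copy rather than for the whole summation. The trade-off is that the 
total is no longer an atomic snapshot across tasks: a task updated mid-sum may 
or may not be reflected, which is assumed to be acceptable for counters 
reporting.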

