Prabhu Joseph created MAPREDUCE-6530:
----------------------------------------
Summary: Jobtracker is slow when more JT UI requests
Key: MAPREDUCE-6530
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6530
Project: Hadoop Map/Reduce
Issue Type: Bug
Affects Versions: 2.5.1
Reporter: Prabhu Joseph
Priority: Blocker
JobTracker is slow when a large number of jobs are running and 30
connections are established to the info port to view job status and counters.
Under this load, hadoop job -list took 4m22.412s.
We took jstack traces and found most of the server threads waiting on the
JobTracker object, while the thread that holds the JobTracker lock is itself
waiting for the lock on the ResourceBundles class:
"retireJobs" prio=10 tid=0x00007f2345200800 nid=0x11c1 waiting for monitor entry [0x00007f22e3499000]
   java.lang.Thread.State: BLOCKED (on object monitor)
    at org.apache.hadoop.mapreduce.util.ResourceBundles.getValue(ResourceBundles.java:56)
    - waiting to lock <0x0000000197cc6218> (a java.lang.Class for org.apache.hadoop.mapreduce.util.ResourceBundles)
    at org.apache.hadoop.mapreduce.util.ResourceBundles.getCounterName(ResourceBundles.java:89)
    at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.localizeCounterName(FrameworkCounterGroup.java:135)
    at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.access$000(FrameworkCounterGroup.java:47)
    at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup$FrameworkCounter.getDisplayName(FrameworkCounterGroup.java:75)
    at org.apache.hadoop.mapred.Counters$Counter.getDisplayName(Counters.java:130)
    at org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:534)
    - locked <0x00000007f8411608> (a org.apache.hadoop.mapred.Counters)
    at org.apache.hadoop.mapred.JobInProgress.incrementTaskCounters(JobInProgress.java:1728)
    at org.apache.hadoop.mapred.JobInProgress.getMapCounters(JobInProgress.java:1669)
    at org.apache.hadoop.mapred.JobTracker$RetireJobs.addToCache(JobTracker.java:657)
    - locked <0x000000009644ae08> (a org.apache.hadoop.mapred.JobTracker$RetireJobs)
    at org.apache.hadoop.mapred.JobTracker$RetireJobs.run(JobTracker.java:769)
    - locked <0x00000000964c5550> (a org.apache.hadoop.mapred.FairScheduler)
    - locked <0x000000009644a9d0> (a java.util.Collections$SynchronizedMap)
    - locked <0x00000000962ac660> (a org.apache.hadoop.mapred.JobTracker)
    at java.lang.Thread.run(Thread.java:745)
The ResourceBundles class lock is held most of the time by JT web UI threads
serving jobtracker_jsp, which call getMapCounters():
"926410165@qtp-1732070199-56" daemon prio=10 tid=0x00007f232c4df000 nid=0x27c0 runnable [0x00007f22db7bf000]
   java.lang.Thread.State: RUNNABLE
    at java.lang.Throwable.fillInStackTrace(Native Method)
    at java.lang.Throwable.fillInStackTrace(Throwable.java:783)
    - locked <0x000000061a49ede0> (a java.util.MissingResourceException)
    at java.lang.Throwable.<init>(Throwable.java:287)
    at java.lang.Exception.<init>(Exception.java:84)
    at java.lang.RuntimeException.<init>(RuntimeException.java:80)
    at java.util.MissingResourceException.<init>(MissingResourceException.java:85)
    at java.util.ResourceBundle.throwMissingResourceException(ResourceBundle.java:1499)
    at java.util.ResourceBundle.getBundleImpl(ResourceBundle.java:1322)
    at java.util.ResourceBundle.getBundle(ResourceBundle.java:1028)
    at org.apache.hadoop.mapreduce.util.ResourceBundles.getBundle(ResourceBundles.java:37)
    at org.apache.hadoop.mapreduce.util.ResourceBundles.getValue(ResourceBundles.java:56)
    - locked <0x0000000197cc6218> (a java.lang.Class for org.apache.hadoop.mapreduce.util.ResourceBundles)
    at org.apache.hadoop.mapreduce.util.ResourceBundles.getCounterName(ResourceBundles.java:89)
    at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.localizeCounterName(FrameworkCounterGroup.java:135)
    at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup.access$000(FrameworkCounterGroup.java:47)
    at org.apache.hadoop.mapreduce.counters.FrameworkCounterGroup$FrameworkCounter.getDisplayName(FrameworkCounterGroup.java:75)
    at org.apache.hadoop.mapred.Counters$Counter.getDisplayName(Counters.java:130)
    at org.apache.hadoop.mapred.Counters.incrAllCounters(Counters.java:534)
    - locked <0x00000007ed1024b8> (a org.apache.hadoop.mapred.Counters)
    at org.apache.hadoop.mapred.JobInProgress.incrementTaskCounters(JobInProgress.java:1728)
    at org.apache.hadoop.mapred.JobInProgress.getMapCounters(JobInProgress.java:1669)
    at org.apache.hadoop.mapred.JSPUtil.generateJobTable(JSPUtil.java:436)
    at org.apache.hadoop.mapred.jobtracker_jsp._jspService(jobtracker_jsp.java:202)
    at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:98)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
Every job updates its counters, and all 30 UI clients read these frequently
updated counters, which leads to the JT slowness.
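In the UI thread's trace, a MissingResourceException is being constructed while
the ResourceBundles class lock is held, so a failed bundle lookup appears to be
paid for inside the critical section. If the same counter names are looked up
repeatedly, one possible mitigation is to remember each result, including the
"not found" fallback. The following is only a minimal sketch of that idea:
DisplayNameCache and its method names are hypothetical and are not part of
Hadoop; only java.util.ResourceBundle is a real API here.

```java
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.ResourceBundle;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper (not a Hadoop class): caches counter display names. */
final class DisplayNameCache {
    // Maps "bundleName#key" to the resolved display name or the fallback.
    private final ConcurrentMap<String, String> cache = new ConcurrentHashMap<>();

    // Counts real bundle lookups so the effect of the cache can be observed.
    final AtomicInteger bundleLookups = new AtomicInteger();

    /**
     * Returns the localized name for key, or defaultValue when the bundle or
     * the key is missing. defaultValue must be non-null to be cached.
     */
    String getDisplayName(String bundleName, String key, String defaultValue) {
        String cacheKey = bundleName + '#' + key;
        // Only the first caller for a given key performs the bundle lookup.
        return cache.computeIfAbsent(cacheKey,
                k -> lookup(bundleName, key, defaultValue));
    }

    private String lookup(String bundleName, String key, String defaultValue) {
        bundleLookups.incrementAndGet();
        try {
            ResourceBundle bundle =
                    ResourceBundle.getBundle(bundleName, Locale.getDefault());
            return bundle.getString(key);
        } catch (MissingResourceException e) {
            // The miss is cached as well, so the exception is built only once.
            return defaultValue;
        }
    }
}
```

With this sketch, a second call for the same bundle name and key returns the
cached string without touching ResourceBundle again. Whether this fits the JT
depends on how counter names are resolved in the affected version.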
With no JT UI requests, hadoop job -list completes in seconds.
How can we fix the JT slowness when 30 sessions want to view the job status
and counters of a large number of concurrently running jobs?
Is there a workaround, such as caching in the JT UI or offloading part of the
JT UI front page when load is heavy?
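To make the "JT UI caching" idea concrete, here is a rough sketch of a
time-bounded snapshot: the expensive job table is rebuilt at most once per
interval, and all UI sessions within that interval share the same copy. This
is an illustration under assumptions, not an existing JobTracker feature:
TimedSnapshot, its loader, and the 5-second interval in the usage note are
hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Hypothetical helper: serves a cached value until it is older than ttlMillis. */
final class TimedSnapshot<T> {
    private static final class Entry<V> {
        final V value;
        final long createdAtMillis;
        Entry(V value, long createdAtMillis) {
            this.value = value;
            this.createdAtMillis = createdAtMillis;
        }
    }

    private final Supplier<T> loader;   // expensive computation, e.g. building the job table
    private final long ttlMillis;       // maximum age of a served snapshot
    private final LongSupplier clock;   // injectable clock, in milliseconds
    private final AtomicReference<Entry<T>> current = new AtomicReference<>();
    private final Object refreshLock = new Object();

    TimedSnapshot(Supplier<T> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    T get() {
        long now = clock.getAsLong();
        Entry<T> e = current.get();
        if (e != null && now - e.createdAtMillis < ttlMillis) {
            return e.value; // fast path: no lock taken while the snapshot is fresh
        }
        synchronized (refreshLock) {
            // Re-check so that only one thread rebuilds an expired snapshot.
            e = current.get();
            if (e != null && now - e.createdAtMillis < ttlMillis) {
                return e.value;
            }
            T fresh = loader.get();
            current.set(new Entry<>(fresh, now));
            return fresh;
        }
    }
}
```

A page handler could hold one instance, for example
new TimedSnapshot<>(loader, 5000L, System::currentTimeMillis), where loader
stands for whatever builds the job table, and call get() on every request. The
trade-off is that the UI may show counters that are up to one interval old.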