TaskGraphServlet throws ArrayIndexOutOfBoundsException when progress % > 100%
-----------------------------------------------------------------------------
Key: MAPREDUCE-2088
URL: https://issues.apache.org/jira/browse/MAPREDUCE-2088
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobtracker
Affects Versions: 0.20.2
Environment: Hadoop 0.20.2, r911707 running on a Linux cluster with 4
worker nodes and LZO-compressed map output
Reporter: Age Mooij
I'm currently running a MapReduce job in which the reducer inserts data into
HBase. It is a long-running job dealing with hundreds of GB of map output data,
and 9 out of 10 reducers are currently showing 110+% progress (i.e. the total
job reduce progress is 100%, but clicking through to the reducer list shows
reducers with percentages over 100%).
The number of reduce input records is 4,299,991,005.
While I was keeping an eye on it through the job detail page of the JobTracker
web GUI, the JobTracker started logging exceptions every 30 seconds. They turned
out to come from the TaskGraphServlet, which displays reducer progress. The
stack trace also showed up in the web GUI in the area that normally holds the
reducer progress graph. 30 seconds is of course the refresh rate of the status
page.
I guess the root cause here is progress percentages being greater than 100%. I
found issue HADOOP-5210 which reports a cause for progress crossing 100% but it
is marked as fixed in 0.20.1 and I'm running 0.20.2.
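The suspected failure mode can be illustrated with a minimal sketch. It assumes, hypothetically, that the servlet maps a reducer's progress fraction onto one of the three reduce phases (copy, sort, reduce) by scaling it into an array index. The class and method names below are illustrative only and are not taken from the Hadoop source. Under that assumption a progress of 1.10 yields index 3 for a three-element array, which would be consistent with the ArrayIndexOutOfBoundsException: 3 reported in this issue, and clamping the value first avoids it.

```java
public class ProgressClampSketch {
    // Assumption: the graph tracks three reduce phases (copy, sort, reduce).
    public static final int PHASES = 3;

    /** Clamp a reported progress fraction into the range [0, 1]. */
    public static float clamp(float progress) {
        if (Float.isNaN(progress)) {
            return 0f;
        }
        return Math.max(0f, Math.min(1f, progress));
    }

    /** Unchecked mapping: any progress >= 1.0 produces an index >= PHASES. */
    public static int unsafePhaseIndex(float progress) {
        return (int) (progress * PHASES);
    }

    /** Checked mapping: clamp first, then cap the index at the last phase. */
    public static int safePhaseIndex(float progress) {
        int index = (int) (clamp(progress) * PHASES);
        return Math.min(index, PHASES - 1);
    }

    public static void main(String[] args) {
        float[] perPhase = new float[PHASES];
        float reported = 1.10f; // a reducer reporting 110% progress

        // The unchecked index falls outside perPhase.
        System.out.println("unsafe index: " + unsafePhaseIndex(reported));

        // The checked index is always valid for perPhase.
        int index = safePhaseIndex(reported);
        perPhase[index] += 1f;
        System.out.println("safe index: " + index);
    }
}
```

Clamping only hides the symptom in the graph; it does not explain why reducers report more than 100% in the first place.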
One thing that might affect the progress percentages is the fact that we
LZO-compress our map outputs. For the above job, uncompressed output ("Map
output bytes") was 2,462,412,228,874 bytes, which compressed to 694,405,054,632
bytes (FILE_BYTES_WRITTEN/map).
Here's the exception:
2010-09-23 18:03:05,484 ERROR org.mortbay.log: /taskgraph
java.lang.ArrayIndexOutOfBoundsException: 3
    at org.apache.hadoop.mapred.TaskGraphServlet.getReduceAvarageProgresses(TaskGraphServlet.java:199)
    at org.apache.hadoop.mapred.TaskGraphServlet.doGet(TaskGraphServlet.java:131)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:324)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)