We see this happening for multiple operations such as setting permissions for some file.
We also sometimes see Jetty throwing org.mortbay.jetty.EofException when
fetching map outputs. This is sometimes caused by a connection reset by peer
or a broken pipe error. Most of the time it is followed by a
java.lang.IllegalStateException: Committed error. There are no peculiarities
in syslog or elsewhere.

I have not tuned tasktracker.http.threads, but dfs.datanode.max.xcievers
instead. They seem to regulate similar things. The default is 40, and we have
15 reducers and 50 mappers in this case, with parallel copies set to 20. Is
there an error here?

On Wednesday 04 January 2012 02:49:26 Harsh J wrote:
> Markus,
>
> What's the permissions on /opt/hadoop/hadoop-0.20.205.0/logs (-la)? Does
> this happen only on a single node?
>
> On 04-Jan-2012, at 3:25 AM, Markus Jelsma wrote:
> > Hi,
> >
> > On our 0.20.205.0 test cluster we sometimes see tasks failing for no
> > clear reason. The task tracker logs show us:
> >
> > 2012-01-03 21:16:27,256 WARN org.apache.hadoop.mapred.TaskLog: Failed to
> > retrieve stdout log for task: attempt_201201031651_0008_m_000233_0
> > java.io.FileNotFoundException:
> > /opt/hadoop/hadoop-0.20.205.0/libexec/../logs/userlogs/job_201201031651_0008/attempt_201201031651_0008_m_000233_0/log.index
> > (No such file or directory)
> >     at java.io.FileInputStream.open(Native Method)
> >     at java.io.FileInputStream.<init>(FileInputStream.java:120)
> >     at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
> >     at org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:187)
> >     at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:422)
> >     at org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
> >     at org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
> >     at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
> >     at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> >     at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
> >     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
> >     at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
> >     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> >     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> >     at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> >     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> >     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
> >     at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
> >     at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> >     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> >     at org.mortbay.jetty.Server.handle(Server.java:326)
> >     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> >     at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
> >
> > However, if we inspect the log more closely we actually see it happening
> > several times but only one seems to be thrown simultaneously with the
> > task failing. We see no errors in the datanode's log.
> >
> > I've been looking through the configuration descriptions of 0.20.205.0
> > but didn't find a setting that could fix this or is responsible for
> > this.
> >
> > Any hints? Upgrade to 1.0? Patches?
> >
> > Thanks

-- 
Markus Jelsma - CTO - Openindex