I cannot seem to fix the issue by increasing tasktracker.http.threads, so I've returned it to the default for now.
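
For reference, these are the knobs in play, roughly as they would sit in mapred-site.xml and hdfs-site.xml. This is only a sketch: the 40 is the stock default for tasktracker.http.threads, the 20 is the parallel-copies value mentioned below, and the xcievers value is just a placeholder, not what we actually run.

  <!-- mapred-site.xml -->
  <property>
    <!-- Jetty threads the tasktracker uses to serve map output to reducers; default 40. -->
    <name>tasktracker.http.threads</name>
    <value>40</value>
  </property>
  <property>
    <!-- Parallel fetches each reducer opens against tasktrackers during the shuffle. -->
    <name>mapred.reduce.parallel.copies</name>
    <value>20</value>
  </property>

  <!-- hdfs-site.xml -->
  <property>
    <!-- Upper bound on concurrent data-transfer threads per datanode; placeholder value. -->
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>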
Are there any other hints? Thanks

On Friday 06 January 2012 12:29:10 Markus Jelsma wrote:
> We see this happening for multiple operations, such as setting permissions
> on some file.
>
> We also sometimes see Jetty throwing org.mortbay.jetty.EofException when
> getting map outputs. This is sometimes caused by "connection reset by
> peer" or "broken pipe" errors. Most of the time it is followed by a
> java.lang.IllegalStateException: Committed error.
>
> There are no peculiarities in syslog or elsewhere.
>
> I've not tuned tasktracker.http.threads but dfs.datanode.max.xcievers
> instead. They seem to regulate similar things. The default is 40, and we
> have 15 reducers and 50 mappers in this case, with parallel copies set
> to 20.
>
> Is there an error here?
>
> On Wednesday 04 January 2012 02:49:26 Harsh J wrote:
> > Markus,
> >
> > What are the permissions on /opt/hadoop/hadoop-0.20.205.0/logs (-la)?
> > Does this happen only on a single node?
> >
> > On 04-Jan-2012, at 3:25 AM, Markus Jelsma wrote:
> > > Hi,
> > >
> > > On our 0.20.205.0 test cluster we sometimes see tasks failing for no
> > > clear reason. The task tracker logs show us:
> > >
> > > 2012-01-03 21:16:27,256 WARN org.apache.hadoop.mapred.TaskLog: Failed
> > > to retrieve stdout log for task: attempt_201201031651_0008_m_000233_0
> > > java.io.FileNotFoundException:
> > > /opt/hadoop/hadoop-0.20.205.0/libexec/../logs/userlogs/job_201201031651_0008/attempt_201201031651_0008_m_000233_0/log.index
> > > (No such file or directory)
> > >     at java.io.FileInputStream.open(Native Method)
> > >     at java.io.FileInputStream.<init>(FileInputStream.java:120)
> > >     at org.apache.hadoop.io.SecureIOUtils.openForRead(SecureIOUtils.java:102)
> > >     at org.apache.hadoop.mapred.TaskLog.getAllLogsFileDetails(TaskLog.java:187)
> > >     at org.apache.hadoop.mapred.TaskLog$Reader.<init>(TaskLog.java:422)
> > >     at org.apache.hadoop.mapred.TaskLogServlet.printTaskLog(TaskLogServlet.java:81)
> > >     at org.apache.hadoop.mapred.TaskLogServlet.doGet(TaskLogServlet.java:296)
> > >     at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
> > >     at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> > >     at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
> > >     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
> > >     at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:835)
> > >     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> > >     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> > >     at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> > >     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> > >     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
> > >     at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
> > >     at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> > >     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> > >     at org.mortbay.jetty.Server.handle(Server.java:326)
> > >     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> > >     at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
> > >
> > > However, if we inspect the log more closely we actually see this
> > > happening several times, but only one occurrence seems to coincide
> > > with the task failing. We see no errors in the datanode's log.
> > >
> > > I've been looking through the configuration descriptions for
> > > 0.20.205.0 but didn't find a setting that could fix this or be
> > > responsible for it.
> > >
> > > Any hints? Upgrade to 1.0? Patches?
> > > Thanks
> > >
> > > --
> > > Markus Jelsma - CTO - Openindex