Thanks for the responses, guys. I think you're right that the commit size was just too big, since running a smaller job does seem to eliminate the issue. I'd be interested to know exactly what was going wrong, but I'm sick of investigating it for now, so maybe next time ;)

Thanks again!
jce

On Mon, Sep 29, 2014 at 4:12 AM, Markus Jelsma <[email protected]> wrote:

> Hi - I don't think the indexing stage is reached at all, judging from the
> MapOutputFormat. We sometimes see this happening during the shuffle stage;
> some mapred limits need to be adjusted to overcome this, but I don't
> remember which. But you can always decrease the size of a job and just run
> more jobs; it is probably much more efficient in overall throughput because
> the shuffle stage is always expensive.
>
> Markus
>
> > -----Original message-----
> > From: Talat Uyarer <[email protected]>
> > Sent: Monday 29th September 2014 6:57
> > To: [email protected]
> > Subject: Re: Solr Indexer Reduce Tasks "fail to report status"
> >
> > Hi Jonathan,
> >
> > Sorry for the late response.
> >
> > I guess your commit size is too high for your Solr server. Maybe you have
> > large web pages; because of them, your commits take a long time. Can you
> > try decreasing your commit size, and can you check the HTTP content limit?
> >
> > Talat
> >
> > On Sep 26, 2014 4:37 PM, "Jonathan Cooper-Ellis" <[email protected]> wrote:
> >
> > > Hi Talat,
> > >
> > > Thanks for the reply. I looked in the Solr logs as well and nothing
> > > jumped out at me. I didn't notice anything interesting in the
> > > JobTracker logs either. The logs I included in the original message
> > > are the logs from the TaskTracker when it fails, unless you're talking
> > > about a different task log that I don't know about.
> > >
> > > Do you have any other ideas?
> > >
> > > Thanks,
> > > jce
> > >
> > > On Fri, Sep 26, 2014 at 12:41 AM, Talat Uyarer <[email protected]> wrote:
> > >
> > > > Hi Jonathan,
> > > >
> > > > Did you check your Solr log file? Something may have gone wrong on
> > > > the Solr side. Another question: did you check the task log of your
> > > > failed attempt? These logs are useful for debugging.
> > > >
> > > > Talat
> > > >
> > > > On Sep 25, 2014 10:59 PM, "Jonathan Cooper-Ellis" <[email protected]> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > I have been running Nutch 1.9 on Hadoop 1.2.1 using the
> > > > > deploy/bin/crawl script for a little while with no problems.
> > > > > However, I just increased the scope of the crawl pretty
> > > > > significantly, and now *most* of my Indexer jobs are failing on
> > > > > the reduce task showing the error "Task
> > > > > attempt_201409241419_0046_r_000000_3 failed to report status for
> > > > > 600 seconds. Killing!". From the TT logs, the main issue seems to
> > > > > be "Caused by: java.io.IOException: Connection reset by peer".
> > > > >
> > > > > I found some suggestions that these errors could be caused by
> > > > > somaxconn being too low, so I increased it from 128 to 256 on the
> > > > > node running Solr and the JT, and it didn't help. I also bumped
> > > > > the memory for MR tasks up to 1024m from 700-something, which
> > > > > doesn't seem to have helped either.
> > > > >
> > > > > Has anyone seen this before? Or have any idea what could cause
> > > > > this?
> > > > >
> > > > > Here is the relevant excerpt from the TT logs:
> > > > >
> > > > > 2014-09-25 00:40:25,580 WARN org.apache.hadoop.mapred.TaskTracker: getMapOutput(attempt_201409241419_0033_m_000018_0,0) failed :
> > > > > org.mortbay.jetty.EofException
> > > > >     at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:791)
> > > > >     at org.mortbay.jetty.AbstractGenerator$Output.blockForOutput(AbstractGenerator.java:551)
> > > > >     at org.mortbay.jetty.AbstractGenerator$Output.flush(AbstractGenerator.java:572)
> > > > >     at org.mortbay.jetty.HttpConnection$Output.flush(HttpConnection.java:1012)
> > > > >     at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:651)
> > > > >     at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:580)
> > > > >     at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:4125)
> > > > >     at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
> > > > >     at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> > > > >     at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
> > > > >     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
> > > > >     at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:914)
> > > > >     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> > > > >     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> > > > >     at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> > > > >     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> > > > >     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
> > > > >     at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
> > > > >     at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> > > > >     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> > > > >     at org.mortbay.jetty.Server.handle(Server.java:326)
> > > > >     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> > > > >     at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
> > > > >     at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
> > > > >     at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
> > > > >     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
> > > > >     at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
> > > > >     at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> > > > > Caused by: java.io.IOException: Connection reset by peer
> > > > >     at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
> > > > >     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> > > > >     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
> > > > >     at sun.nio.ch.IOUtil.write(IOUtil.java:65)
> > > > >     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
> > > > >     at org.mortbay.io.nio.ChannelEndPoint.flush(ChannelEndPoint.java:170)
> > > > >     at org.mortbay.io.nio.SelectChannelEndPoint.flush(SelectChannelEndPoint.java:221)
> > > > >     at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:725)
> > > > >     ... 27 more
> > > > >
> > > > > 2014-09-25 00:40:25,580 WARN org.mortbay.log: Committed before 410 getMapOutput(attempt_201409241419_0033_m_000018_0,0) failed :
> > > > > org.mortbay.jetty.EofException
> > > > >     at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:791)
> > > > >     at org.mortbay.jetty.AbstractGenerator$Output.blockForOutput(AbstractGenerator.java:551)
> > > > >     at org.mortbay.jetty.AbstractGenerator$Output.flush(AbstractGenerator.java:572)
> > > > >     at org.mortbay.jetty.HttpConnection$Output.flush(HttpConnection.java:1012)
> > > > >     at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:651)
> > > > >     at org.mortbay.jetty.AbstractGenerator$Output.write(AbstractGenerator.java:580)
> > > > >     at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:4125)
> > > > >     at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
> > > > >     at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> > > > >     at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
> > > > >     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
> > > > >     at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:914)
> > > > >     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> > > > >     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> > > > >     at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> > > > >     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> > > > >     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
> > > > >     at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
> > > > >     at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> > > > >     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> > > > >     at org.mortbay.jetty.Server.handle(Server.java:326)
> > > > >     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> > > > >     at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
> > > > >     at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
> > > > >     at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
> > > > >     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
> > > > >     at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
> > > > >     at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> > > > > Caused by: java.io.IOException: Connection reset by peer
> > > > >     at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
> > > > >     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> > > > >     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
> > > > >     at sun.nio.ch.IOUtil.write(IOUtil.java:65)
> > > > >     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:487)
> > > > >     at org.mortbay.io.nio.ChannelEndPoint.flush(ChannelEndPoint.java:170)
> > > > >     at org.mortbay.io.nio.SelectChannelEndPoint.flush(SelectChannelEndPoint.java:221)
> > > > >     at org.mortbay.jetty.HttpGenerator.flush(HttpGenerator.java:725)
> > > > >     ... 27 more
> > > > >
> > > > > 2014-09-25 00:40:25,580 INFO org.apache.hadoop.mapred.TaskTracker.clienttrace: src: 172.31.36.63:50060, dest: 172.31.36.65:53836, bytes: 720896, op: MAPRED_SHUFFLE, cliID: attempt_201409241419_0033_m_000018_0, duration: 5555977
> > > > > 2014-09-25 00:40:25,581 ERROR org.mortbay.log: /mapOutput
> > > > > java.lang.IllegalStateException: Committed
> > > > >     at org.mortbay.jetty.Response.resetBuffer(Response.java:1023)
> > > > >     at org.mortbay.jetty.Response.sendError(Response.java:240)
> > > > >     at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:4162)
> > > > >     at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
> > > > >     at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
> > > > >     at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
> > > > >     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1221)
> > > > >     at org.apache.hadoop.http.HttpServer$QuotingInputFilter.doFilter(HttpServer.java:914)
> > > > >     at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
> > > > >     at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
> > > > >     at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
> > > > >     at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
> > > > >     at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
> > > > >     at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
> > > > >     at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
> > > > >     at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
> > > > >     at org.mortbay.jetty.Server.handle(Server.java:326)
> > > > >     at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542)
> > > > >     at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:928)
> > > > >     at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:549)
> > > > >     at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:212)
> > > > >     at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404)
> > > > >     at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410)
> > > > >     at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)
> > > > >
> > > > > Best,
> > > > > jce

