On Thu, Aug 9, 2012 at 10:11 AM, Markus Jelsma <markus.jel...@openindex.io> wrote:
> I've increased the connection time out on all 10 Tomcats from 1000ms to
> 5000ms. Indexing a larger amount of batches seems to run fine now. This,
> however, does not really answer the issue. What is exactly timing out here
> and why?

It can be any communication with Tomcat, for any reason. For example, a commit needs to flush and fsync all segments, apply buffered deletes, etc., then open a new searcher and run any configured warming queries or autowarming. That can take some time. It takes even longer if you want to optimize. A long GC pause could also cause a socket timeout.

For the stock Jetty server, we set it to 50,000ms, which frankly may still be too short for some things.

Here's the Jetty documentation for the parameter:

"""
maxIdleTime: Set the maximum Idle time for a connection, which roughly translates to the Socket.setSoTimeout(int) call, although with NIO implementations other mechanisms may be used to implement the timeout. The max idle time is applied: when waiting for a new request to be received on a connection; when reading the headers and content of a request; when writing the headers and content of a response. Jetty interprets this value as the maximum time between some progress being made on the connection. So if a single byte is read or written, then the timeout (if implemented by jetty) is reset. However, in many instances, the reading/writing is delegated to the JVM, and the semantic is more strictly enforced as the maximum time a single read/write operation can take. Note, that as Jetty supports writes of memory mapped file buffers, then a write may take many 10s of seconds for large content written to a slow device.
"""

-Yonik
http://lucidimagination.com

> I assume it's the forwarding of documents from the `indexing node` to the
> correct shard leader but with 512 maxThreads it should be fine.
>
> Any hints?
>
> Thanks
>
>
>
> -----Original message----- >> From:Markus Jelsma <markus.jel...@openindex.io> >> Sent: Wed 08-Aug-2012 00:10 >> To: solr-user@lucene.apache.org >> Subject: RE: null:java.lang.RuntimeException: [was class >> java.net.SocketTimeoutException] null >> >> Jack, >> >> There are no peculiarities in the JVM graphs. Only increase in used threads >> and GC time.
Heap space is collected quickly and doesn't suddenly increase. >> There's only 256MB available for the heap but it's fine. >> >> >> Yonik, >> >> I'll increase the time out to five seconds tomorrow and try to reproduce it >> with a low batch size of 32. Judging from what i've seen it should throw an >> error quickly with such a low batch size. However, what is timing out here? >> My client connection to the indexing node or something else that i don't see? >> >> Unfortunately no Jetty here (yet). >> >> Thanks >> Markus >> >> >> -----Original message----- >> > From:Yonik Seeley <yo...@lucidimagination.com> >> > Sent: Tue 07-Aug-2012 23:54 >> > To: solr-user@lucene.apache.org >> > Subject: Re: null:java.lang.RuntimeException: [was class >> > java.net.SocketTimeoutException] null >> > >> > Could this be just a simple case of a socket timeout? Can you raise >> > the timeout on request threads in Tomcat? >> > It's a lot easier to reproduce/diagnose stuff like this when people >> > use the stock jetty server shipped with Solr. >> > >> > -Yonik >> > http://lucidimagination.com >> > >> > >> > On Tue, Aug 7, 2012 at 5:39 PM, Markus Jelsma >> > <markus.jel...@openindex.io> wrote: >> > > A significant detail is the batch size which we set to 64 documents due to >> > > earlier memory limitations. We index segments of roughly 300-500k >> > > records each time. Lowering the batch size to 32 led to an early >> > > internal server error and the stack trace below. Increasing it to 128 >> > > allowed us to index some more records but it still throws the error >> > > after 200k+ indexed records. >> > > >> > > Increasing it even more to 256 records per batch allowed us to index an >> > > entire segment without errors. >> > > >> > > Another detail is that we do not restart the cluster between indexing >> > > attempts so it seems that something only builds up during indexing >> > > (nothing seems to leak afterwards) and throws an error. >> > > >> > > Any hints? 
>> > > >> > > Thanks, >> > > Markus >> > > >> > > >> > > >> > > -----Original message----- >> > >> From:Markus Jelsma <markus.jel...@openindex.io> >> > >> Sent: Tue 07-Aug-2012 20:08 >> > >> To: solr-user@lucene.apache.org >> > >> Subject: null:java.lang.RuntimeException: [was class >> > >> java.net.SocketTimeoutException] null >> > >> >> > >> Hello, >> > >> >> > >> We sometimes see the error below in our `master` when indexing. Our >> > >> master is currently the node we send documents to - we've not yet >> > >> implemented CloudSolrServer in Apache Nutch. This causes the indexer to >> > >> crash when using Nutch locally, the task is retried when running on >> > >> Hadoop. We're running it locally in this test set up so there's only >> > >> one indexing thread. >> > >> >> > >> Anyway, for me it's quite a cryptic error because i don't know what >> > >> connection has timed out, i assume a connection from the indexing node >> > >> to some other node in the cluster when it passes a document to the >> > >> correct leader? Each node of the 10 node cluster has the same >> > >> configuration, Tomcat is configured with maxThreads=512 and a time out >> > >> of one second. >> > >> >> > >> We're using today's trunk in this test set up and we cannot reliably >> > >> reproduce the error. We've seen the error before so it's not a very >> > >> recent issue. No errors are found in the other node's logs. 
>> > >> >> > >> 2012-08-07 17:52:05,260 ERROR [solr.servlet.SolrDispatchFilter] - >> > >> [http-8080-exec-6] - : null:java.lang.RuntimeException: [was class >> > >> java.net.SocketTimeoutException] null >> > >> at >> > >> com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18) >> > >> at >> > >> com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:731) >> > >> at >> > >> com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3657) >> > >> at >> > >> com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809) >> > >> at >> > >> org.apache.solr.handler.loader.XMLLoader.readDoc(XMLLoader.java:376) >> > >> at >> > >> org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:229) >> > >> at >> > >> org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:157) >> > >> at >> > >> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92) >> > >> at >> > >> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74) >> > >> at >> > >> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) >> > >> at org.apache.solr.core.SolrCore.execute(SolrCore.java:1656) >> > >> at >> > >> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:454) >> > >> at >> > >> io.openindex.solr.servlet.HttpResponseSolrDispatchFilter.doFilter(HttpResponseSolrDispatchFilter.java:219) >> > >> at >> > >> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) >> > >> at >> > >> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) >> > >> at >> > >> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) >> > >> at >> > >> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) >> > >> at >> > >> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) >> > >> 
at >> > >> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) >> > >> at >> > >> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) >> > >> at >> > >> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293) >> > >> at >> > >> org.apache.coyote.http11.Http11NioProcessor.process(Http11NioProcessor.java:889) >> > >> at >> > >> org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:744) >> > >> at >> > >> org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:2274) >> > >> at >> > >> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) >> > >> at >> > >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) >> > >> at java.lang.Thread.run(Thread.java:662) >> > >> Caused by: java.net.SocketTimeoutException >> > >> at >> > >> org.apache.tomcat.util.net.NioBlockingSelector.read(NioBlockingSelector.java:185) >> > >> at >> > >> org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:229) >> > >> at >> > >> org.apache.tomcat.util.net.NioSelectorPool.read(NioSelectorPool.java:210) >> > >> at >> > >> org.apache.coyote.http11.InternalNioInputBuffer.readSocket(InternalNioInputBuffer.java:643) >> > >> at >> > >> org.apache.coyote.http11.InternalNioInputBuffer.fill(InternalNioInputBuffer.java:945) >> > >> at >> > >> org.apache.coyote.http11.InternalNioInputBuffer$SocketInputBuffer.doRead(InternalNioInputBuffer.java:969) >> > >> at >> > >> org.apache.coyote.http11.filters.ChunkedInputFilter.readBytes(ChunkedInputFilter.java:268) >> > >> at >> > >> org.apache.coyote.http11.filters.ChunkedInputFilter.doRead(ChunkedInputFilter.java:167) >> > >> at >> > >> org.apache.coyote.http11.InternalNioInputBuffer.doRead(InternalNioInputBuffer.java:916) >> > >> at org.apache.coyote.Request.doRead(Request.java:427) >> > >> at >> > >> 
org.apache.catalina.connector.InputBuffer.realReadBytes(InputBuffer.java:304) >> > >> at >> > >> org.apache.tomcat.util.buf.ByteChunk.substract(ByteChunk.java:419) >> > >> at >> > >> org.apache.catalina.connector.InputBuffer.read(InputBuffer.java:327) >> > >> at >> > >> org.apache.catalina.connector.CoyoteInputStream.read(CoyoteInputStream.java:162) >> > >> at com.ctc.wstx.io.UTF8Reader.loadMore(UTF8Reader.java:365) >> > >> at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:110) >> > >> at com.ctc.wstx.io.MergedReader.read(MergedReader.java:101) >> > >> at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:84) >> > >> at >> > >> com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57) >> > >> at >> > >> com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:992) >> > >> at >> > >> com.ctc.wstx.sr.BasicStreamReader.readTextSecondary(BasicStreamReader.java:4628) >> > >> at >> > >> com.ctc.wstx.sr.BasicStreamReader.readCoalescedText(BasicStreamReader.java:4126) >> > >> at >> > >> com.ctc.wstx.sr.BasicStreamReader.finishToken(BasicStreamReader.java:3701) >> > >> at >> > >> com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3649) >> > >> ... 24 more >> > >> >> > >> Any thoughts to share? Is this a bug? A misconfiguration? >> > >> >> > >> Thanks, >> > >> Markus >> > >> >> > >>
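
For anyone reading this thread later: the "Caused by" frames in the trace above sit in Tomcat's read of the chunked request body (ChunkedInputFilter, NioBlockingSelector.read), which suggests the server gave up while waiting for more bytes from the sender. Below is a minimal plain-Java sketch of the Socket.setSoTimeout behaviour that the Jetty documentation compares maxIdleTime to. It is not Solr, Tomcat, or Nutch code; the class name SoTimeoutDemo, the method readTimesOut, and the timings are made up for illustration. A "server" socket reads a request body from a client that pauses mid-stream, and the read fails once the pause exceeds the read timeout.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

public class SoTimeoutDemo {

    /**
     * Hypothetical helper: starts a loopback server whose reads time out after
     * serverTimeoutMs, and a client that pauses clientPauseMs in the middle of
     * sending its body. Returns true if the server-side read timed out.
     */
    public static boolean readTimesOut(int serverTimeoutMs, long clientPauseMs) throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread client = new Thread(() -> {
                try (Socket s = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort())) {
                    OutputStream out = s.getOutputStream();
                    // Send the first part of the body, then stall (slow producer, GC pause, ...).
                    out.write("<add><doc>".getBytes(StandardCharsets.UTF_8));
                    out.flush();
                    Thread.sleep(clientPauseMs);
                    out.write("</doc></add>".getBytes(StandardCharsets.UTF_8));
                    out.flush();
                } catch (IOException | InterruptedException e) {
                    // The server may already have closed the connection; nothing to do here.
                }
            });
            client.start();

            try (Socket conn = server.accept()) {
                // Maximum time a single blocking read may wait for data.
                conn.setSoTimeout(serverTimeoutMs);
                InputStream in = conn.getInputStream();
                byte[] buf = new byte[64];
                try {
                    while (in.read(buf) != -1) {
                        // Keep reading until the client closes the stream.
                    }
                    return false; // whole body arrived in time
                } catch (SocketTimeoutException e) {
                    return true;  // the sender stalled longer than the timeout
                }
            } finally {
                client.join();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("timeout 200 ms, pause 1000 ms -> timed out: " + readTimesOut(200, 1000));
        System.out.println("timeout 2000 ms, pause 200 ms -> timed out: " + readTimesOut(2000, 200));
    }
}
```

On an otherwise idle machine the first call should report a timeout and the second should not. The point is only that the timeout bounds how long a single read may wait for the next bytes, not the duration of the whole request, so anything that makes the sender pause for longer than the configured value can trigger it.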