We use Amazon EC2 machines with 34GB of memory (m2.2xlarge). The Solr heap is 8GB. We have several cores, totaling about 14GB on disk. This configuration allows 100% of the indexes to be in file buffers.
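As a rough back-of-the-envelope check (the path below is only an example, not our actual layout): 34GB of RAM minus the 8GB heap leaves roughly 26GB for the OS page cache, comfortably more than the ~14GB of index, so after warm-up every index read can be served from memory. Standard tools are enough to sanity-check this on a node:

    # total on-disk size of all core indexes (example path)
    du -sch /var/solr/data/*/data/index

    # the "cached" column shows how much RAM the page cache currently holds
    free -m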
wunder

On Aug 26, 2013, at 9:57 AM, Furkan KAMACI wrote:

> Hi Walter;
>
> You said you are caching your documents. What is the average Physical Memory
> usage of your Solr nodes?
>
>
> 2013/8/26 Walter Underwood <wun...@wunderwood.org>
>
>> It looks like that error happens when reading XML from an HTTP request. The
>> XML ends too soon. This should be unrelated to file buffers.
>>
>> wunder
>>
>> On Aug 26, 2013, at 9:17 AM, Furkan KAMACI wrote:
>>
>>> It has 48 GB of RAM and the index size is nearly 100 GB at each node. I have
>>> CentOS 6.4. While indexing I got that error, and I suspect it is because of
>>> the high percentage of Physical Memory usage.
>>>
>>> ERROR - 2013-08-21 22:01:30.979; org.apache.solr.common.SolrException;
>>> java.lang.RuntimeException: [was class org.eclipse.jetty.io.EofException] early EOF
>>>     at com.ctc.wstx.util.ExceptionUtil.throwRuntimeException(ExceptionUtil.java:18)
>>>     at com.ctc.wstx.sr.StreamScanner.throwLazyError(StreamScanner.java:731)
>>>     at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3657)
>>>     at com.ctc.wstx.sr.BasicStreamReader.getText(BasicStreamReader.java:809)
>>>     at org.apache.solr.handler.loader.XMLLoader.readDoc(XMLLoader.java:393)
>>>     at org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:245)
>>>     at org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
>>>     at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
>>>     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>>>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
>>>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:1812)
>>>     at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639)
>>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345)
>>>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
>>>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
>>>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
>>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
>>>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
>>>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
>>>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1072)
>>>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:382)
>>>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
>>>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1006)
>>>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
>>>     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
>>>     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
>>>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
>>>     at org.eclipse.jetty.server.Server.handle(Server.java:365)
>>>     at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:485)
>>>     at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
>>>     at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:937)
>>>     at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:998)
>>>     at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:948)
>>>     at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
>>>     at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
>>>     at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
>>>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
>>>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
>>>     at java.lang.Thread.run(Thread.java:722)
>>> Caused by: org.eclipse.jetty.io.EofException: early EOF
>>>     at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:65)
>>>     at java.io.InputStream.read(InputStream.java:101)
>>>     at com.ctc.wstx.io.UTF8Reader.loadMore(UTF8Reader.java:365)
>>>     at com.ctc.wstx.io.UTF8Reader.read(UTF8Reader.java:110)
>>>     at com.ctc.wstx.io.MergedReader.read(MergedReader.java:101)
>>>     at com.ctc.wstx.io.ReaderSource.readInto(ReaderSource.java:84)
>>>     at com.ctc.wstx.io.BranchingReaderSource.readInto(BranchingReaderSource.java:57)
>>>     at com.ctc.wstx.sr.StreamScanner.loadMore(StreamScanner.java:992)
>>>     at com.ctc.wstx.sr.BasicStreamReader.readTextSecondary(BasicStreamReader.java:4628)
>>>     at com.ctc.wstx.sr.BasicStreamReader.readCoalescedText(BasicStreamReader.java:4126)
>>>     at com.ctc.wstx.sr.BasicStreamReader.finishToken(BasicStreamReader.java:3701)
>>>     at com.ctc.wstx.sr.BasicStreamReader.safeFinishToken(BasicStreamReader.java:3649)
>>>     ... 36 more
>>>
>>>
>>> 2013/8/26 Walter Underwood <wun...@wunderwood.org>
>>>
>>>> What is the precise error? What kind of machine?
>>>>
>>>> File buffers are a robust part of the OS. Unix has had file buffer caching
>>>> for decades.
>>>>
>>>> wunder
>>>>
>>>> On Aug 26, 2013, at 1:37 AM, Furkan KAMACI wrote:
>>>>
>>>>> Hi Walter;
>>>>>
>>>>> You are right about performance. However, when I index documents on a
>>>>> machine that has a high percentage of Physical Memory usage, I get EOF
>>>>> errors.
>>>>>
>>>>>
>>>>> 2013/8/26 Walter Underwood <wun...@wunderwood.org>
>>>>>
>>>>>> On Aug 25, 2013, at 1:41 PM, Furkan KAMACI wrote:
>>>>>>
>>>>>>> Sometimes Physical Memory usage of Solr is over 99% and this may cause
>>>>>>> problems. Do you run such a command periodically:
>>>>>>>
>>>>>>> sudo sh -c "sync; echo 3 > /proc/sys/vm/drop_caches"
>>>>>>>
>>>>>>> to force dropping the caches of the machine that Solr runs on and avoid
>>>>>>> problems?
>>>>>>
>>>>>>
>>>>>> This is a terrible idea. The OS automatically manages the file buffers.
>>>>>> When they are all used, that is a good thing, because it reduces disk IO.
>>>>>>
>>>>>> After this, no files will be cached in RAM. Every single read from a file
>>>>>> will have to go to disk. This will cause very slow performance until the
>>>>>> files are recached.
>>>>>>
>>>>>> Recently, I did exactly the opposite to improve performance in our Solr
>>>>>> installation. Before starting the Solr process, a script reads every file
>>>>>> in the index so that it will already be in file buffers. This avoids
>>>>>> several minutes of high disk IO and slow performance after startup.
>>>>>> wunder
>>>>>> Search Guy, Chegg.com

--
Walter Underwood
wun...@wunderwood.org
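For reference, a minimal sketch of the warm-up approach described in the quoted thread above, reading every index file once before starting Solr so it is already in the file buffers (the data directory is only an example, adjust it to your install):

    #!/bin/sh
    # Read every byte of every index file so the OS page cache is warm
    # before Solr starts serving traffic. Path is an example.
    find /var/solr/data -type f -print0 | xargs -0 cat > /dev/null

    # then start Solr as usual

Run this before the Solr process comes up and the first queries hit warm file buffers instead of triggering several minutes of heavy disk IO.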