You're most likely not getting anywhere _near_ 4.2G written to Solr; the transport layer is probably cutting the upload off, as indicated by the "Early EOF" exception.
It's really hard to justify trying to index 4.2G as a _single_ file. First, you won't even be able to receive it in Solr when you've given it only 1G of memory, even if you get the transport stuff worked out. Second, searching it is useless in most cases, as it will probably match _everything_. Third, even if it does match something, how are you going to return it to a user?

If it's multiple documents in a huge uber-doc, you can break it up at ingestion and send only the individual docs to Solr rather than the whole thing.

IOW, I think this is a waste of your time. I understand that you're trying to see the limits, but this limit is not a reasonable one to hope to cross.

Best,
Erick

On Mon, Jun 27, 2016 at 6:24 AM, Rajendran, Prabaharan <rajendra...@dnb.com> wrote:
> Hi,
>
> I am trying to index a text file about 4.2 GB in size. This kind of POC to
> understand Solr capacity on indexing & searching.
>
> Here is my Solr configuration
> -Xms1024m -Xmx1024m -Xss256k
>
> java -Dtype=text/csv -Dparams="separator=%09"
> -Durl=http://localhost:8983/solr/mycollection/update -jar
> ..\example\exampledocs\post.jar ..\example\exampledocs\largefile.txt
>
> While doing index got error like below,
> SimplePostTool: FATAL: IOException while posting data: java.io.IOException:
> too many bytes written
>
> Kindly let me know, if I need to change (increase memory) any solr
> configuration to handle this.
>
> Here is my log file entry,
>
> ERROR (qtp297811323-14) [ x:collection2] o.a.s.c.SolrCore
> org.apache.solr.common.SolrException: CSVLoader: input=null,
> line=2815040,can't read line: 2815040
> values={NO LINES AVAILABLE}
>     at org.apache.solr.handler.loader.CSVLoaderBase.input_err(CSVLoaderBase.java:317)
>     at org.apache.solr.handler.loader.CSVLoaderBase.load(CSVLoaderBase.java:356)
>     at org.apache.solr.handler.loader.CSVLoader.load(CSVLoader.java:31)
>     at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
>     at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>     at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
>     at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
>     at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:669)
>     at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:462)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:214)
>     at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:179)
>     at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
>     at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>     at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
>     at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
>     at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
>     at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>     at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>     at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
>     at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>     at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
>     at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
>     at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
>     at org.eclipse.jetty.server.Server.handle(Server.java:499)
>     at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
>     at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
>     at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
>     at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: org.eclipse.jetty.io.EofException: Early EOF
>     at org.eclipse.jetty.server.HttpInput$3.noContent(HttpInput.java:506)
>     at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:124)
>     at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
>     at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
>     at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
>     at java.io.InputStreamReader.read(InputStreamReader.java:184)
>     at java.io.BufferedReader.fill(BufferedReader.java:154)
>     at java.io.BufferedReader.read(BufferedReader.java:175)
>     at org.apache.solr.internal.csv.ExtendedBufferedReader.read(ExtendedBufferedReader.java:82)
>     at org.apache.solr.internal.csv.CSVParser.simpleTokenLexer(CSVParser.java:421)
>     at org.apache.solr.internal.csv.CSVParser.nextToken(CSVParser.java:371)
>     at org.apache.solr.internal.csv.CSVParser.getLine(CSVParser.java:231)
>     at org.apache.solr.handler.loader.CSVLoaderBase.load(CSVLoaderBase.java:353)
>     ... 29 more
>
> Thanks,
> Prabaharan
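
P.S. To make the "break it up at ingestion" suggestion concrete, here is a rough Python sketch rather than a tested recipe. It assumes the file is tab-separated with one record per line (no quoted fields containing newlines) and that the first line is a header, which is what the post.jar command above implies since it passes no field names. The update URL and the separator parameter are copied from that command; the function names and the batch size of 10000 are made up for illustration.

```python
import urllib.request
from itertools import islice

# Update URL taken from the post.jar command in the quoted message.
SOLR_UPDATE_URL = "http://localhost:8983/solr/mycollection/update"


def iter_batches(lines, batch_size):
    """Yield CSV payloads of at most batch_size data rows.

    Every payload repeats the header line so Solr's CSV loader can
    interpret each request on its own. Assumes one record per line.
    """
    lines = iter(lines)
    header = next(lines, None)
    if header is None:
        return
    while True:
        chunk = list(islice(lines, batch_size))
        if not chunk:
            return
        yield header + "".join(chunk)


def post_batch(payload, url=SOLR_UPDATE_URL):
    """POST one tab-separated payload to the update handler."""
    request = urllib.request.Request(
        url + "?separator=%09",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "text/csv; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def index_file(path, batch_size=10000, url=SOLR_UPDATE_URL):
    """Stream the file from disk and send it in small requests."""
    # newline="" keeps the original line endings untouched.
    with open(path, encoding="utf-8", newline="") as handle:
        for payload in iter_batches(handle, batch_size):
            post_batch(payload, url)
    # One commit at the end instead of one per batch.
    with urllib.request.urlopen(url + "?commit=true") as response:
        return response.status
```

Calling `index_file(r"..\example\exampledocs\largefile.txt")` would then send many small requests instead of one 4.2G upload, so neither the client nor Solr ever has to deal with the whole file at once. Whether 10000 rows per request is sensible depends on how wide the rows are.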