You're most likely not getting anywhere _near_ 4.2G written to Solr; the
transport protocol is probably cutting the upload off, as indicated by
the "Early EOF" exception.

It's really hard to justify trying to index 4.2G as a _single_ file.
First, you won't even be able to receive it in Solr, since
you've given it only 1G of memory, even if you get the
transport stuff worked out. Second, searching it is totally
useless in most cases, as it will probably match _everything_.
Third, even if it does match something, how are you going
to return it to a user?

If it's multiple documents in a huge uber-doc you can
break it up at ingestion and only send docs to Solr rather
than the whole thing.
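
As a rough illustration of breaking the file up before ingestion, here is a
minimal Python sketch. It assumes the input is a line-oriented, tab-separated
file whose first line is a header and that no field contains an embedded
newline. The function name, chunk file names, and chunk size are all made up
for the example, so tune them to your data.

```python
import os


def split_with_header(src_path, out_dir, rows_per_chunk=100_000):
    """Split a delimited text file into smaller files, repeating the header.

    Assumes one record per line (no quoted fields with embedded newlines).
    Returns the list of chunk file paths in the order they were written.
    """
    os.makedirs(out_dir, exist_ok=True)
    chunk_paths = []
    out = None
    try:
        # newline="" keeps the original line endings untouched.
        with open(src_path, "r", encoding="utf-8", newline="") as src:
            header = src.readline()
            if not header:
                return chunk_paths  # empty input, nothing to split
            for row_number, line in enumerate(src):
                # Start a new chunk every rows_per_chunk data rows.
                if row_number % rows_per_chunk == 0:
                    if out is not None:
                        out.close()
                    path = os.path.join(
                        out_dir, f"chunk_{len(chunk_paths):05d}.txt"
                    )
                    chunk_paths.append(path)
                    out = open(path, "w", encoding="utf-8", newline="")
                    out.write(header)
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return chunk_paths
```

Each resulting chunk can then be posted on its own with the same post.jar
command you are already using, so no single request comes anywhere near the
size of the original file.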

IOW, I think this is a waste of your time. I understand that
you're trying to see the limits, but this limit is not a reasonable
one to hope to cross.

Best,
Erick

On Mon, Jun 27, 2016 at 6:24 AM, Rajendran, Prabaharan
<rajendra...@dnb.com> wrote:
> Hi,
>
> I am trying to index a text file about 4.2 GB in size. This is a kind of POC to
> understand Solr's capacity for indexing & searching.
>
> Here is my Solr configuration
> -Xms1024m        -Xmx1024m        -Xss256k
>
> java -Dtype=text/csv -Dparams="separator=%09" 
> -Durl=http://localhost:8983/solr/mycollection/update -jar 
> ..\example\exampledocs\post.jar ..\example\exampledocs\largefile.txt
>
> While indexing, I got the error below:
> SimplePostTool: FATAL: IOException while posting data: java.io.IOException: too many bytes written
>
> Kindly let me know, if I need to change (increase memory) any solr 
> configuration to handle this.
>
> Here is my log file entry,
>
> ERROR (qtp297811323-14) [   x:collection2] o.a.s.c.SolrCore org.apache.solr.common.SolrException: CSVLoader: input=null, line=2815040,can't read line: 2815040
>                 values={NO LINES AVAILABLE}
>                 at org.apache.solr.handler.loader.CSVLoaderBase.input_err(CSVLoaderBase.java:317)
>                 at org.apache.solr.handler.loader.CSVLoaderBase.load(CSVLoaderBase.java:356)
>                 at org.apache.solr.handler.loader.CSVLoader.load(CSVLoader.java:31)
>                 at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:98)
>                 at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
>                 at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:143)
>                 at org.apache.solr.core.SolrCore.execute(SolrCore.java:2068)
>                 at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:669)
>                 at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:462)
>                 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:214)
>                 at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:179)
>                 at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
>                 at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
>                 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>                 at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
>                 at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
>                 at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
>                 at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
>                 at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
>                 at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
>                 at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>                 at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
>                 at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
>                 at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
>                 at org.eclipse.jetty.server.Server.handle(Server.java:499)
>                 at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
>                 at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
>                 at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
>                 at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
>                 at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
>                 at java.lang.Thread.run(Thread.java:745)
> Caused by: org.eclipse.jetty.io.EofException: Early EOF
>                 at org.eclipse.jetty.server.HttpInput$3.noContent(HttpInput.java:506)
>                 at org.eclipse.jetty.server.HttpInput.read(HttpInput.java:124)
>                 at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
>                 at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
>                 at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
>                 at java.io.InputStreamReader.read(InputStreamReader.java:184)
>                 at java.io.BufferedReader.fill(BufferedReader.java:154)
>                 at java.io.BufferedReader.read(BufferedReader.java:175)
>                 at org.apache.solr.internal.csv.ExtendedBufferedReader.read(ExtendedBufferedReader.java:82)
>                 at org.apache.solr.internal.csv.CSVParser.simpleTokenLexer(CSVParser.java:421)
>                 at org.apache.solr.internal.csv.CSVParser.nextToken(CSVParser.java:371)
>                 at org.apache.solr.internal.csv.CSVParser.getLine(CSVParser.java:231)
>                 at org.apache.solr.handler.loader.CSVLoaderBase.load(CSVLoaderBase.java:353)
>                 ... 29 more
>
> Thanks,
> Prabaharan
