Hi all,
I'm trying to import some text files into Solr, mostly following Avi
Rappoport's tutorial. Some of my files make Solr fail with an HTTP 500
error while indexing. I've narrowed it down to a very simple example.
I have a file named test.txt with one line. That line is the word
XXBLE and nothing else.
This is the command I'm using:
curl "http://localhost:8080/solr-example/update/extract?literal.id=1&commit=true" \
  -F "myfi...@test.txt"
The result is pasted below. Other files work just fine. The problem
seems to be related to the letters B and E: if I change them to
something else, or make them lowercase, it works. In my real files
the XX is something else, but the result is the same, and it's a
common word in those files. For this "quick and dirty" job I could
do a bulk replace in the files to make the word lowercase.
Is there any other workaround for this?
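A minimal sketch of the bulk replace mentioned above, for illustration only. The function name lowercase_word and the output file name are hypothetical, and it assumes the word consists of plain letters (no regex metacharacters or slashes). It reads from standard input and writes to standard output, so the original files are left untouched.

```shell
#!/bin/sh
# Lowercase every occurrence of one word in the text read from stdin.
# Assumption: the word is plain letters, so it is safe to use directly
# as a sed pattern. Note that sed matches substrings too, so XXBLE
# inside a longer word is also rewritten.
lowercase_word() {
    word=$1
    # Build the lowercase form of the word with POSIX tr.
    lower=$(printf '%s' "$word" | tr '[:upper:]' '[:lower:]')
    # Replace all occurrences on every line.
    sed "s/$word/$lower/g"
}
```

For example, `lowercase_word XXBLE < test.txt > test.lower.txt` would produce a copy that can be posted to Solr instead of the original. This changes the text that gets indexed, so it only suits cases where the case of that word does not matter.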
Thanks
Ross
<html><head><title>Apache Tomcat/6.0.20 - Error report</title><style><!--
H1 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:22px;}
H2 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:16px;}
H3 {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;font-size:14px;}
BODY {font-family:Tahoma,Arial,sans-serif;color:black;background-color:white;}
B {font-family:Tahoma,Arial,sans-serif;color:white;background-color:#525D76;}
P {font-family:Tahoma,Arial,sans-serif;background:white;color:black;font-size:12px;}
A {color : black;}
A.name {color : black;}
HR {color : #525D76;}
--></style> </head><body><h1>HTTP Status 500 - org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.txt.txtpar...@19ccba
org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.txt.txtpar...@19ccba
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:211)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:233)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
    at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
    at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
    at java.lang.Thread.run(Thread.java:636)
Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.txt.txtpar...@19ccba
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:121)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:105)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:190)
    ... 18 more
Caused by: java.lang.NullPointerException
    at java.io.Reader.<init>(Reader.java:78)
    at java.io.BufferedReader.<init>(BufferedReader.java:93)
    at java.io.BufferedReader.<init>(BufferedReader.java:108)
    at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:59)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:119)
    ... 20 more
</h1><HR size="1" noshade="noshade"><p><b>type</b> Status
report</p><p><b>message</b>
<u>org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.txt.txtpar...@19ccba
</u></p><p><b>description</b> <u>The server encountered an internal
error (org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.txt.txtpar...@19ccba
) that prevented it from fulfilling this request.</u></p><HR size="1" noshade="noshade"><h3>Apache Tomcat/6.0.20</h3></body></html>