[ 
https://issues.apache.org/jira/browse/TIKA-397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jukka Zitting resolved TIKA-397.
--------------------------------

      Assignee: Jukka Zitting
    Resolution: Duplicate

This was fixed in Tika 0.5 as a side-effect of other changes. Solr trunk has 
already upgraded to a more recent Tika version (see SOLR-1819), so the fix will 
also be included in the next Solr release.

> Parser crashes on very simple file
> ----------------------------------
>
>                 Key: TIKA-397
>                 URL: https://issues.apache.org/jira/browse/TIKA-397
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.4
>         Environment: Solr 1.4 on Ubuntu 9.10.  OpenJDK Runtime Environment 
> (IcedTea6 1.6.1) (6b16-1.6.1-3ubuntu1)
>            Reporter: Ross Keatinge
>            Assignee: Jukka Zitting
>
> Sorry but I can only talk about this from a Solr user's point of view. I'm 
> using Solr's ExtractingRequestHandler (Solr Cell) to index some text files. 
> In general it's working fine but Tika crashes when parsing a text file with 
> certain upper case short words near the start of the file. I haven't 
> been able to discover the pattern of what works and what doesn't but here's a 
> real simple example.
> A file with just the letters XE and nothing else crashes. If I edit the file 
> and change it to any of XA, XB, XC, XD or XF it works but XE always crashes. 
> Lower case works.
> I discovered this with certain five letter words that unfortunately are very 
> common in my documents.
> Here's the error message from Solr.
> HTTP Status 500 - org.apache.tika.exception.TikaException: 
> Unexpected RuntimeException from org.apache.tika.parser.txt.txtpar...@a51027
> org.apache.solr.common.SolrException: 
> org.apache.tika.exception.TikaException: Unexpected RuntimeException from 
> org.apache.tika.parser.txt.txtpar...@a51027
>       at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:211)
>       at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
>       at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>       at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:233)
>       at org.apache.solr.core.SolrCore.execute(SolrCore.java:1316)
>       at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
>       at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
>       at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
>       at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
>       at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
>       at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
>       at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
>       at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
>       at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
>       at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
>       at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:849)
>       at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
>       at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:454)
>       at java.lang.Thread.run(Thread.java:636)
> Caused by: org.apache.tika.exception.TikaException: Unexpected 
> RuntimeException from org.apache.tika.parser.txt.txtpar...@a51027
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:121)
>       at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:105)
>       at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:190)
>       ... 18 more
> Caused by: java.lang.NullPointerException
>       at java.io.Reader.<init>(Reader.java:78)
>       at java.io.BufferedReader.<init>(BufferedReader.java:93)
>       at java.io.BufferedReader.<init>(BufferedReader.java:108)
>       at org.apache.tika.parser.txt.TXTParser.parse(TXTParser.java:59)
>       at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:119)
>       ... 20 more
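
For readers following the quoted trace: the innermost "Caused by" shows a
NullPointerException raised in the java.io.Reader constructor, reached through
the BufferedReader constructor called from TXTParser.parse. That is the
pattern the JDK produces when a null Reader is wrapped in a BufferedReader.
The report does not say why the reader was null for this input, so the sketch
below only illustrates the JDK behaviour and is not Tika's code; the class
name NullReaderDemo and the helper wrapThrowsNpe are made up for this example.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;

public class NullReaderDemo {
    // Hypothetical helper: reports whether wrapping the given reader in a
    // BufferedReader throws a NullPointerException.
    static boolean wrapThrowsNpe(Reader source) {
        try {
            // BufferedReader passes its argument to the Reader(Object lock)
            // constructor, which rejects null.
            new BufferedReader(source);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(wrapThrowsNpe(null));                   // prints "true"
        System.out.println(wrapThrowsNpe(new StringReader("XE"))); // prints "false"
    }
}
```

A caller that cannot guarantee a non-null reader would need to check for null
before wrapping it, or turn the failure into a checked exception, rather than
let the raw NullPointerException escape.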

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators: 
https://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
