SEVERE: org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.pdf.PDFParser@2ca72c6c
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:215)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
        at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:233)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1322)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.rondhuit.servlet.ConvDoubleByteSpaceToHalfFilter.doFilter(ConvDoubleByteSpaceToHalfFilter.java:32)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.rondhuit.servlet.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:105)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Thread.java:662)
Caused by: org.apache.tika.exception.TikaException: TIKA-198: Illegal IOException from org.apache.tika.parser.pdf.PDFParser@2ca72c6c
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:148)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:99)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:112)
        at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:194)
        ... 24 more
Caused by: java.io.IOException: For input string: "00000000-1"
        at org.apache.pdfbox.pdfparser.PDFParser.parseXrefTable(PDFParser.java:709)
        at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:449)
        at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:179)
        at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:847)
        at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:814)
        at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:63)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:142)

Hello,

Through Solr Cell, one of my users posted a PDF document that contains only a 
CAD image.
Tika then threw the Illegal IOException shown in the attached stack trace.
The error seems to be raised in PDFBox, which apparently cannot parse something 
in the xref table of the PDF.
What kind of error is this? If you have any clue, please let me know.
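
For reference, the message text itself can be reproduced outside Solr. The following is only a minimal sketch, assuming the failing value "00000000-1" is the 10-digit byte-offset field of a classic xref entry (offset, generation number, type flag); XrefEntryCheck and describe are hypothetical names, not part of PDFBox or Tika, and this is not how PDFBox itself parses the table.

```java
public class XrefEntryCheck {

    // Hypothetical helper for illustration only.
    // A classic xref entry has three whitespace-separated fields:
    // a 10-digit byte offset, a 5-digit generation number, and "n" or "f".
    public static String describe(String entry) {
        String[] parts = entry.trim().split("\\s+");
        if (parts.length != 3) {
            return "malformed entry: expected 3 fields";
        }
        try {
            long offset = Long.parseLong(parts[0]);
            int generation = Integer.parseInt(parts[1]);
            return "ok: offset=" + offset
                    + ", generation=" + generation
                    + ", type=" + parts[2];
        } catch (NumberFormatException e) {
            // Java's number parsers report "For input string: ..." here.
            return "bad number: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A well-formed entry.
        System.out.println(describe("0000000017 00000 n"));
        // An entry whose offset field matches the value in the stack trace.
        System.out.println(describe("00000000-1 00000 n"));
    }
}
```

If that assumption holds, the PDF has a negative offset written into a fixed-width field, which would point to a damaged or non-conforming xref table rather than to a problem in Solr Cell.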

environment:
Solr 3.1.0-dev 2010-09-10 
(Apache Tika 0.8-SNAPSHOT, POI 3.6)

Regards,
Shinichiro Abe
