[ 
https://issues.apache.org/jira/browse/TIKA-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tyler Palsulich resolved TIKA-972.
----------------------------------
    Resolution: Fixed

Marking as Fixed, since PDFBOX-1512 was fixed in PDFBox 1.8.8 (Tika's current 
version).

> Unexpected RuntimeException from org.apache.tika.parser.pdf.PDFParser .
> -----------------------------------------------------------------------
>
>                 Key: TIKA-972
>                 URL: https://issues.apache.org/jira/browse/TIKA-972
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 0.9
>         Environment: Core Java, Windows Server 2003
>            Reporter: Priya Kujur
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> While extracting text from a PDF, Tika throws a runtime exception. The 
> exception is not thrown when the Java code runs on Windows 7, but it is 
> thrown when the same code runs on Windows Server 2003.
> This is strange: my development environment is Windows 7 and my production 
> environment is Server 2003. Since Java is platform independent, this issue 
> is driving me crazy.
> Any kind of help is much appreciated.
> Please check the stack trace:
> java.io.IOException:
>         at org.apache.tika.parser.ParsingReader.read(ParsingReader.java:271)
>         at java.io.BufferedReader.fill(Unknown Source)
>         at java.io.BufferedReader.readLine(Unknown Source)
>         at java.io.BufferedReader.readLine(Unknown Source)
>         at com.servient.utilities.textmanipulation.ReaderUtil.readBuffer(ReaderUtil.java:39)
>         at com.servient.mapi.metadata.factory.TikaMetaDataExport.processFile(TikaMetaDataExport.java:255)
>         at com.servient.mapi.metadata.factory.BaseMetadataExport.process(BaseMetadataExport.java:37)
>         at com.servient.mapi.wrapper.AttachmentWrapper.saveTextMetadataExtract(AttachmentWrapper.java:116)
>         at com.servient.mapi.wrapper.AttachmentWrapper.process(AttachmentWrapper.java:40)
>         at com.servient.mapi.wrapper.AttachmentWrapper.<init>(AttachmentWrapper.java:36)
>         at com.servient.mapi.wrapper.MessageWrapper.writeCatalog(MessageWrapper.java:761)
>         at com.servient.mapi.wrapper.MessageWrapper.writeCatalog(MessageWrapper.java:754)
>         at com.servient.mapi.wrapper.MessageWrapper.process(MessageWrapper.java:804)
>         at com.servient.mapi.MAPI.main(MAPI.java:190)
> Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.pdf.PDFParser@ea0a39
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:199)
>         at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
>         at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
>         at org.apache.tika.parser.ParsingReader$ParsingTask.run(ParsingReader.java:232)
>         at java.lang.Thread.run(Unknown Source)
> Caused by: java.lang.IllegalArgumentException: Comparison method violates its general contract!
>         at java.util.TimSort.mergeHi(Unknown Source)
>         at java.util.TimSort.mergeAt(Unknown Source)
>         at java.util.TimSort.mergeCollapse(Unknown Source)
>         at java.util.TimSort.sort(Unknown Source)
>         at java.util.TimSort.sort(Unknown Source)
>         at java.util.Arrays.sort(Unknown Source)
>         at java.util.Collections.sort(Unknown Source)
>         at org.apache.pdfbox.util.PDFTextStripper.writePage(PDFTextStripper.java:551)
>         at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:443)
>         at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:366)
>         at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:322)
>         at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:56)
>         at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:89)
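
Note on the root cause in the trace above: "Comparison method violates its
general contract!" is thrown by TimSort, the algorithm behind
Collections.sort since Java 7, when it notices that a Comparator gives
inconsistent answers. Whether the two machines in this report ran different
Java versions is not stated, so treat that explanation as an assumption. The
sketch below is not PDFBox's comparator; it is a hypothetical tolerance-based
comparator (the class and method names are made up for illustration) that
shows one common way to break the contract, plus a brute-force check that
detects the inconsistency on a small sample.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class; not part of Tika or PDFBox.
class ComparatorContractDemo {

    // Treats values closer than `tolerance` as equal. "Equality" is then
    // not transitive, which breaks the Comparator contract.
    static Comparator<Float> fuzzy(final float tolerance) {
        return new Comparator<Float>() {
            @Override
            public int compare(Float a, Float b) {
                if (Math.abs(a - b) < tolerance) {
                    return 0;
                }
                return a < b ? -1 : 1;
            }
        };
    }

    // Brute-force check over a sample: returns true if any pair or triple
    // breaks antisymmetry, transitivity, or consistency of equality.
    static <T> boolean violatesContract(Comparator<T> cmp, List<T> sample) {
        for (T a : sample) {
            for (T b : sample) {
                int ab = Integer.signum(cmp.compare(a, b));
                if (ab != -Integer.signum(cmp.compare(b, a))) {
                    return true; // sgn(compare(a,b)) must equal -sgn(compare(b,a))
                }
                for (T c : sample) {
                    int bc = Integer.signum(cmp.compare(b, c));
                    int ac = Integer.signum(cmp.compare(a, c));
                    if (ab > 0 && bc > 0 && ac <= 0) {
                        return true; // a > b and b > c must imply a > c
                    }
                    if (ab == 0 && ac != bc) {
                        return true; // a == b must imply a and b compare alike to c
                    }
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // 0.0 ~ 0.4 and 0.4 ~ 0.8 under tolerance 0.5, yet 0.0 < 0.8.
        List<Float> sample = Arrays.asList(0.0f, 0.4f, 0.8f);
        System.out.println(violatesContract(fuzzy(0.5f), sample));
        System.out.println(violatesContract(Comparator.<Float>naturalOrder(), sample));
    }
}
```

Such a comparator does not fail on every input, only when TimSort happens to
hit the inconsistency, which may be why the error looks intermittent. As a
stopgap on an affected JVM, the system property
-Djava.util.Arrays.useLegacyMergeSort=true restores the pre-Java-7 merge
sort; upgrading to a release that contains the fix, as the resolution above
says, is the real remedy.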



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
