[ 
https://issues.apache.org/jira/browse/PDFBOX-3626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15745681#comment-15745681
 ] 

Seva Alekseyev commented on PDFBOX-3626:
----------------------------------------

Sorry about throwing all those trash-but-not-complete-trash documents at you 
guys. They were thrown at me in the first place.

You should see the freak zoo of Office documents that I'm also dealing with. 
Corrupt and almost-corrupt PDFs are a small percentage in my Tika log.

> StackOverflowException on a valid PDF
> -------------------------------------
>
>                 Key: PDFBOX-3626
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3626
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.3
>         Environment: Windows 7 x64, JVM 1.8.0_101
>            Reporter: Seva Alekseyev
>         Attachments: PDF-01555.PDF
>
>
> On the attached document, which opens fine in Acrobat, PDDocument.load() 
> throws a StackOverflowError:
> Exception in thread "main" java.lang.StackOverflowError
>       at sun.nio.cs.UTF_8$Decoder.decodeLoop(UTF_8.java:412)
>       at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:579)
>       at java.nio.charset.CharsetDecoder.decode(CharsetDecoder.java:802)
>       at org.apache.pdfbox.pdfparser.BaseParser.isValidUTF8(BaseParser.java:805)
>       at org.apache.pdfbox.pdfparser.BaseParser.parseCOSName(BaseParser.java:785)
>       at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:905)
>       at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryValue(BaseParser.java:153)
>       at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryNameValuePair(BaseParser.java:277)
>       at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:210)
>       at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:885)
>       at org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:772)
>       at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:741)
>       at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:672)
>       at org.apache.pdfbox.pdfparser.COSParser.getLength(COSParser.java:897)
>       at org.apache.pdfbox.pdfparser.COSParser.parseCOSStream(COSParser.java:949)
>       at org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:780)
>       at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:741)
>       at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:672)
> ...
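
The frames near the bottom of the trace repeat (parseObjectDynamically -> parseFileObject -> parseCOSStream -> getLength -> parseObjectDynamically), which looks like resolving a stream's /Length leading into parsing another object, with nothing bounding the depth. Whether that is what actually happens in PDF-01555.PDF is not established here. The following is only a toy sketch of one way to guard such a recursion, by tracking which objects are currently being resolved. Every name in it (LengthCycleGuard, addObject, resolve) is hypothetical and is not part of PDFBox.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these names are PDFBox classes or methods.
final class LengthCycleGuard {
    // Maps an object number to the object number its /Length entry points at,
    // or to null when the length is a direct value with nothing left to resolve.
    private final Map<Integer, Integer> lengthRef = new HashMap<>();

    // Object numbers whose resolution has started but not yet finished.
    private final Set<Integer> inProgress = new HashSet<>();

    void addObject(int objNr, Integer lengthRefersTo) {
        lengthRef.put(objNr, lengthRefersTo);
    }

    // Returns the number of indirections followed. Unknown objects count as
    // direct values. Throws instead of recursing without bound on a cycle.
    int resolve(int objNr) {
        if (!inProgress.add(objNr)) {
            throw new IllegalStateException(
                    "circular /Length reference at object " + objNr);
        }
        try {
            Integer next = lengthRef.get(objNr);
            return next == null ? 0 : 1 + resolve(next);
        } finally {
            // Always clear the marker so the guard can be reused afterwards.
            inProgress.remove(objNr);
        }
    }
}
```

With this guard, a chain such as object 1 -> object 2 -> direct value resolves normally, while 1 -> 2 -> 1 fails fast with an IllegalStateException rather than exhausting the stack. A guard like this only covers true cycles; a very long but non-circular chain would still need a depth limit or an iterative loop.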



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
