[ 
https://issues.apache.org/jira/browse/PDFBOX-4426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16745275#comment-16745275
 ] 

Tilman Hausherr commented on PDFBOX-4426:
-----------------------------------------

That is part of the structure tree, which isn't needed unless you want to 
access it (because you're interested in the structure tree, or in a merge). But 
it is still parsed because we parse everything.
The easiest way to "repair" the file would be to replace "/Pg 
18446744073177568688 0 R" with the same number of blanks. Or replace the large 
number with "2" and enough blanks. (Because there are other /Pg 2 0 R 
occurrences, we can assume that this object is a page.)
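
A minimal sketch of that byte-level repair, in case it helps. The class and 
method names (PgReferencePatcher, overwritePadded) are made up for this 
example and are not part of PDFBox. It assumes the broken reference appears in 
the file exactly as written above, with single spaces and outside of any 
compressed stream; if the file uses other whitespace there, the search pattern 
has to be adjusted. Padding with blanks keeps the file length unchanged, so 
the offsets in the cross-reference table should remain valid.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox.
class PgReferencePatcher {

    /**
     * Returns a copy of data in which every occurrence of needle is
     * overwritten by replacement, padded with blanks to the needle's length.
     * The result always has the same length as the input.
     */
    static byte[] overwritePadded(byte[] data, byte[] needle, byte[] replacement) {
        if (needle.length == 0 || replacement.length > needle.length) {
            throw new IllegalArgumentException(
                    "replacement must fit into a non-empty needle");
        }
        byte[] out = Arrays.copyOf(data, data.length);
        int i = 0;
        while (i + needle.length <= data.length) {
            if (matchesAt(data, needle, i)) {
                // Blank the whole match, then write the replacement at its start.
                Arrays.fill(out, i, i + needle.length, (byte) ' ');
                System.arraycopy(replacement, 0, out, i, replacement.length);
                i += needle.length;
            } else {
                i++;
            }
        }
        return out;
    }

    private static boolean matchesAt(byte[] data, byte[] needle, int pos) {
        for (int j = 0; j < needle.length; j++) {
            if (data[pos + j] != needle[j]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: PgReferencePatcher <in.pdf> <out.pdf>");
            return;
        }
        // Assumed spelling of the broken entry; adjust to what the file contains.
        byte[] needle = "/Pg 18446744073177568688 0 R"
                .getBytes(StandardCharsets.ISO_8859_1);
        byte[] patched = overwritePadded(
                Files.readAllBytes(Paths.get(args[0])), needle, new byte[0]);
        Files.write(Paths.get(args[1]), patched);
    }
}
```

Passing an empty replacement blanks out the whole entry. For the second 
variant, search for "18446744073177568688" only and pass "2" as the 
replacement. Work on a copy of the file, never on the original.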

> Not parsable pdf document
> -------------------------
>
>                 Key: PDFBOX-4426
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4426
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.13
>            Reporter: Nico Prenzel
>            Priority: Minor
>         Attachments: PDFBox Bug - PDFBOX-4426.png
>
>
> I've got another unparsable PDF document from our customers.
> Unfortunately, I'm not allowed to post the PDF document this time.
> Perhaps the stack trace is sufficient to fix the parsing...
> IOException expected number, actual=COSFloat{18446744073177568688} at offset 
> 693140
> org.apache.pdfbox.pdfparser.BaseParser parseCOSDictionaryValue: 166
> org.apache.pdfbox.pdfparser.BaseParser parseCOSDictionaryNameValuePair: 279
> org.apache.pdfbox.pdfparser.BaseParser parseCOSDictionary: 212
> org.apache.pdfbox.pdfparser.BaseParser parseDirObject: 864
> org.apache.pdfbox.pdfparser.COSParser parseFileObject: 904
> org.apache.pdfbox.pdfparser.COSParser parseObjectDynamically: 873
> org.apache.pdfbox.pdfparser.COSParser parseObjectDynamically: 793
> org.apache.pdfbox.pdfparser.COSParser parseDictObjects: 753
> org.apache.pdfbox.pdfparser.PDFParser initialParse: 187
> org.apache.pdfbox.pdfparser.PDFParser parse: 226
> org.apache.pdfbox.pdmodel.PDDocument load: 1200
> org.apache.pdfbox.pdmodel.PDDocument load: 1097
> vlh.Tools.PDF.PDFBoxUtil$1 run: 148


