[ 
https://issues.apache.org/jira/browse/PDFBOX-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16697832#comment-16697832
 ] 

Tilman Hausherr edited comment on PDFBOX-4385 at 11/24/18 2:07 PM:
-------------------------------------------------------------------

Of course the PDF is invalid. 18446744073430152624 is not a valid page object 
number, which indicates that the software your client used to create the file 
has a bug. Parsing on demand, i.e. parsing only what we need, is a strategy 
that we don't support yet (although it would have its advantages). The bad 
object number is in the structure tree, which isn't needed unless you're blind 
(very simplified, there are other uses too). PDFBox doesn't use the structure 
tree for what we usually do (rendering, text extraction, signing, etc.), 
although a basic API exists.
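
To illustrate why this number cannot be an object number, here is a minimal 
sketch in plain Java (no PDFBox involved). The class and method names are made 
up for this example. It shows that the value does not fit into a signed 64-bit 
long, and that its bit pattern, read as signed, is a small negative number. 
That the creator software wrote a negative value as unsigned is only a guess, 
nothing in this issue confirms it.

```java
import java.math.BigInteger;

// Hypothetical helper for illustration only, not part of PDFBox.
public class ObjectNumberCheck {

    // True if the non-negative decimal string fits into a signed 64-bit long.
    static boolean fitsInLong(String digits) {
        return new BigInteger(digits).bitLength() <= 63;
    }

    // Reads the digits as an unsigned 64-bit value and returns the same
    // bit pattern interpreted as a signed long.
    static long reinterpretAsSigned(String digits) {
        return Long.parseUnsignedLong(digits);
    }

    public static void main(String[] args) {
        String objNum = "18446744073430152624"; // value from the /Pg entry
        System.out.println(fitsInLong(objNum));          // prints false
        System.out.println(reinterpretAsSigned(objNum)); // prints -279398992
    }
}
```

The value is 2^64 - 279398992, so it lies far outside anything that could 
refer to a real object in a file of this size.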


was (Author: tilman):
Of course the PDF is invalid. 18446744073430152624 is not a valid page object 
number and it indicates the creator software of your client has a bug. Parsing 
on demand is a strategy which we don't support (although it has its 
advantages): parse only the stuff we need, and the part with the bad object 
number is in the structure tree which isn't needed unless you're blind (very 
simplified, there are other uses too). But the structure tree isn't used by 
PDFBox for what we usually do (rendering, text extraction, signing, etc) 
although a basic API exists.

> IOException "expected number, actual=COSFloat{18446744073430152624}" when 
> loading PDF 
> --------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-4385
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4385
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.12
>         Environment: Mac OS 10.14.1
>            Reporter: Kasper Schnack
>            Priority: Major
>
> On a PDF document, which opens fine with Adobe Reader and Preview on Mac OS, 
> the PDDocument.load() method throws the following:
> java.io.IOException: expected number, actual=COSFloat{18446744073430152624} at offset 33182
>  at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryValue(BaseParser.java:166)
>  at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryNameValuePair(BaseParser.java:279)
>  at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:212)
>  at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:862)
>  at org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:905)
>  at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:874)
>  at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:794)
>  at org.apache.pdfbox.pdfparser.COSParser.parseDictObjects(COSParser.java:754)
>  at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:185)
>  at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:220)
>  at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1160)
>  at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1057)
> Sorry the material is sensitive so I can't attach it :(
>  
> However if I cat the file it looks like this around the offset:
> 48 0 obj
> << /Type /StructElem /S /P /P 30 0 R /Pg 2 0 R /K 15 >>
> endobj
> 49 0 obj
> << /Type /StructElem /S /P /P 30 0 R /Pg 2 0 R /K 16 >>
> endobj
> 50 0 obj
> << /Type /StructElem /S /P /P 30 0 R /Pg 2 0 R /K 17 >>
> endobj
> 51 0 obj
> << /Type /StructElem /S /P /P 30 0 R /Pg 2 0 R /K 18 >>
> endobj
> 52 0 obj
> << /Type /StructElem /S /P /P 30 0 R /Pg 18446744073430152624 0 R /K [ 99 0 R
> 100 0 R ] >>
> endobj
> 99 0 obj
> << /Type /StructElem /S /Span /P 52 0 R /Pg 2 0 R /K 19 >>
> endobj
> 100 0 obj
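
The excerpt above was found by hand with cat. As a rough sketch, the same 
search could be automated in plain Java by scanning the raw file text for 
indirect references ("N G R") whose object number exceeds a chosen limit. The 
class name, method name and limit parameter are hypothetical and not part of 
PDFBox. The limit of 8388607 used in main is an assumption based on the 
implementation limits listed in the PDF 1.7 specification. The scan only sees 
uncompressed objects, so references inside object streams would be missed.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper for illustration only, not part of PDFBox.
public class SuspiciousRefScanner {

    // Matches an indirect reference such as "30 0 R".
    private static final Pattern REF =
            Pattern.compile("(\\d+)\\s+(\\d+)\\s+R\\b");

    // Returns the object numbers of all references above maxObjectNumber.
    static List<String> findSuspiciousRefs(String pdfText, long maxObjectNumber) {
        BigInteger limit = BigInteger.valueOf(maxObjectNumber);
        List<String> hits = new ArrayList<>();
        Matcher m = REF.matcher(pdfText);
        while (m.find()) {
            // BigInteger because the number may not fit into a long.
            if (new BigInteger(m.group(1)).compareTo(limit) > 0) {
                hits.add(m.group(1));
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        // ISO-8859-1 maps every byte to exactly one char, so nothing is lost.
        String text = new String(Files.readAllBytes(Paths.get(args[0])),
                StandardCharsets.ISO_8859_1);
        // Assumed limit, see the note above the code.
        for (String objNum : findSuspiciousRefs(text, 8_388_607L)) {
            System.out.println("suspicious object number: " + objNum);
        }
    }
}
```

Run on the excerpt above, this would report only the /Pg reference in object 
52.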



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
