[ https://issues.apache.org/jira/browse/PDFBOX-4501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr closed PDFBOX-4501.
-----------------------------------
    Resolution: Duplicate

Duplicate of PDFBOX-4495, fixed a few days ago. Your file displays.

> References numbers in embedded PDF become floats
> ------------------------------------------------
>
>                 Key: PDFBOX-4501
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4501
>             Project: PDFBox
>          Issue Type: Bug
>            Reporter: Daniel Persson
>            Priority: Major
>         Attachments: float_pointer.patch
>
>
> Hi everyone.
> We found an issue that sometimes occurs with smaller producers that create 
> PDF files with embedded advertisements or other articles. 
> For some reason, this embedded content makes the library throw an exception 
> and fail to read the file. In many cases, we can read most of the pages, and 
> only the embedded data is missing.
> I wrote a small patch that handles the issue, but I don't know how to 
> decode the embedded data, so I have not debugged the issue further. I will add 
> a link to the file because it is 124 MB, which is too large to upload with 
> the issue.
> [https://drive.google.com/file/d/1hQslqtrbIoo5bTmMXgH1NDSYXuvIUOAQ/view?usp=sharing]
> If we could find a solution where the PDF is read correctly, that would 
> be great, but the current behavior of not reading it at all is a problem.
>  
> ```
> java.io.IOException: expected number, actual=COSFloat{18446744073221199360} at offset 127766191
>  org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryValue(BaseParser.java:166)
>  org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryNameValuePair(BaseParser.java:279)
>  org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:212)
>  org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:864)
>  org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:912)
>  org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:881)
>  org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:801)
>  org.apache.pdfbox.pdfparser.COSParser.parseDictObjects(COSParser.java:761)
>  org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:187)
>  org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:226)
>  org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1069)
>  org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1007)
>  org.apache.pdfbox.debugger.PDFDebugger$12.open(PDFDebugger.java:1272)
>  org.apache.pdfbox.debugger.PDFDebugger$DocumentOpener.parse(PDFDebugger.java:1383)
>  org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1275)
>  org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1252)
>  org.apache.pdfbox.debugger.PDFDebugger.main(PDFDebugger.java:1243)
> ```



