[
https://issues.apache.org/jira/browse/PDFBOX-4501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tilman Hausherr closed PDFBOX-4501.
-----------------------------------
Resolution: Duplicate
Duplicate of PDFBOX-4495, fixed a few days ago. Your file displays.
> References numbers in embedded PDF become floats
> ------------------------------------------------
>
> Key: PDFBOX-4501
> URL: https://issues.apache.org/jira/browse/PDFBOX-4501
> Project: PDFBox
> Issue Type: Bug
> Reporter: Daniel Persson
> Priority: Major
> Attachments: float_pointer.patch
>
>
> Hi everyone.
> We found an issue that occasionally occurs with PDF files from smaller
> producers that embed advertisements or other articles.
> For some reason, the embedded content causes the library to throw an
> exception and fail to read the file. In many cases we can read most of the
> pages, with only the embedded data missing.
> I wrote a small patch that works around the issue, but I don't know how to
> decode the embedded data, so I have not debugged it further. I am adding
> a link to the file because, at 124 MB, it is too large to upload with the
> issue.
> [https://drive.google.com/file/d/1hQslqtrbIoo5bTmMXgH1NDSYXuvIUOAQ/view?usp=sharing]
> It would be great to find a solution that reads the PDF correctly; the
> current behavior of not reading the file at all is a problem.
>
> ```
> java.io.IOException: expected number, actual=COSFloat{18446744073221199360} at offset 127766191
> org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryValue(BaseParser.java:166)
> org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryNameValuePair(BaseParser.java:279)
> org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:212)
> org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:864)
> org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:912)
> org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:881)
> org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:801)
> org.apache.pdfbox.pdfparser.COSParser.parseDictObjects(COSParser.java:761)
> org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:187)
> org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:226)
> org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1069)
> org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1007)
> org.apache.pdfbox.debugger.PDFDebugger$12.open(PDFDebugger.java:1272)
> org.apache.pdfbox.debugger.PDFDebugger$DocumentOpener.parse(PDFDebugger.java:1383)
> org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1275)
> org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1252)
> org.apache.pdfbox.debugger.PDFDebugger.main(PDFDebugger.java:1243)
> ```
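
A note on the number in the exception message: 18446744073221199360 is larger than Long.MAX_VALUE (9223372036854775807), so it cannot be held in a signed 64-bit integer. The sketch below is plain Java, not PDFBox code, and only checks that arithmetic. The class and method names are made up for illustration. Whether the parser reports a COSFloat for exactly this reason is an assumption, not something the report confirms.

```java
import java.math.BigInteger;

// Hypothetical helper class, not part of PDFBox.
public class IntegerOverflowCheck {

    // Returns true if the decimal token parses as a signed 64-bit long.
    static boolean fitsInLong(String token) {
        try {
            Long.parseLong(token);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Returns 2^64 minus the token's value, to show how close the value is to 2^64.
    static BigInteger distanceBelowTwoPow64(String token) {
        return BigInteger.ONE.shiftLeft(64).subtract(new BigInteger(token));
    }

    public static void main(String[] args) {
        String token = "18446744073221199360"; // value copied from the exception message
        System.out.println("fits in long: " + fitsInLong(token));
        System.out.println("2^64 - value: " + distanceBelowTwoPow64(token));
    }
}
```

The value sits only 488352256 below 2^64, which is what a small negative number looks like when written out as an unsigned 64-bit integer. That could point to the producer writing a negative value as unsigned, but this is a guess based on the number alone.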