[ https://issues.apache.org/jira/browse/PDFBOX-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler resolved PDFBOX-4894.
----------------------------------------
    Resolution: Fixed

[~cgrundstrom] Thanks for the fast feedback and the report/fix!

> Invalid file offsets for PDF files larger than 2G
> -------------------------------------------------
>
>                 Key: PDFBOX-4894
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4894
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.20
>         Environment: Linux
>            Reporter: Carl Grundstrom
>            Assignee: Andreas Lehmkühler
>            Priority: Major
>             Fix For: 2.0.21, 3.0.0 PDFBox
>
>
> An integer is being used to calculate file offsets for COS objects. This
> works fine for small PDF files, but breaks when the PDF file is larger than
> 2G. For many large files (136 out of 216 in my sample set), negative file
> offsets are generated for some of the COS objects due to integer overflow.
> This results in an IOException being thrown in COSParser.java at line 728.
> Note that these negative offsets are not valid object stream references.
> I have fixed the problem in my local copy of the code by modifying
> PDFXrefStreamParser.java starting at line 158.
> Current code:
> {code}
> int offset = 0;
> for (int i = 0; i < w1; i++)
> {
>     offset += (currLine[i + w0] & 0x00ff) << ((w1 - i - 1) * 8);
> }
> {code}
> New code:
> {code}
> long offset = 0;
> for (int i = 0; i < w1; i++)
> {
>     offset += ((long)(currLine[i + w0] & 0x00ff)) << ((w1 - i - 1) * 8);
> }
> {code}
> I can submit a sample PDF file if desired (it will be more than 2G in size).
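
For illustration, the overflow described in the report can be reproduced outside PDFBox with a small standalone sketch. The class and method names below (XrefOffsetDemo, decodeOffsetAsInt, decodeOffsetAsLong) and the sample entry layout are hypothetical and are not part of the PDFBox API. Only the two loop bodies mirror the "Current code" and "New code" snippets quoted above. The sample entry assumes a 1-byte type field (w0 = 1) followed by a 4-byte big-endian offset (w1 = 4) encoding exactly 2 GiB.

```java
// Hypothetical demo class; not part of PDFBox.
final class XrefOffsetDemo {

    // Mirrors the "Current code" loop: the shift is done in 32-bit int arithmetic.
    static int decodeOffsetAsInt(byte[] currLine, int w0, int w1) {
        int offset = 0;
        for (int i = 0; i < w1; i++) {
            offset += (currLine[i + w0] & 0x00ff) << ((w1 - i - 1) * 8);
        }
        return offset;
    }

    // Mirrors the "New code" loop: each byte is widened to long before shifting.
    static long decodeOffsetAsLong(byte[] currLine, int w0, int w1) {
        long offset = 0;
        for (int i = 0; i < w1; i++) {
            offset += ((long) (currLine[i + w0] & 0x00ff)) << ((w1 - i - 1) * 8);
        }
        return offset;
    }

    public static void main(String[] args) {
        // Assumed sample entry: type byte, then offset 0x80000000 (2 GiB),
        // then two trailing bytes that the decoders do not read.
        byte[] entry = {1, (byte) 0x80, 0, 0, 0, 0, 0};

        System.out.println(decodeOffsetAsInt(entry, 1, 4));  // prints -2147483648
        System.out.println(decodeOffsetAsLong(entry, 1, 4)); // prints 2147483648
    }
}
```

Any offset at or above 2^31 sets the sign bit of the int result, which is how the negative offsets arise. When the offset field is wider than four bytes, the int version is also wrong for a second reason: Java uses only the low five bits of an int shift distance, so a shift by 32 behaves like a shift by 0.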