[
https://issues.apache.org/jira/browse/PDFBOX-4894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17141134#comment-17141134
]
Carl Grundstrom commented on PDFBOX-4894:
-----------------------------------------
Sure, I'd be glad to test it. Thanks for all your good work on PDFBox!
> Invalid file offsets for PDF files larger than 2G
> -------------------------------------------------
>
> Key: PDFBOX-4894
> URL: https://issues.apache.org/jira/browse/PDFBOX-4894
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 2.0.20
> Environment: Linux
> Reporter: Carl Grundstrom
> Assignee: Andreas Lehmkühler
> Priority: Major
> Fix For: 2.0.21, 3.0.0 PDFBox
>
>
> A 32-bit int is being used to calculate file offsets for COS objects. This
> works fine for small PDF files, but breaks when the PDF file is larger than
> 2 GB. For many large files (136 out of 216 in my sample set), negative file
> offsets are generated for some of the COS objects due to integer overflow.
> This results in an IOException being thrown in COSParser.java at line 728.
> Note that these negative offsets are not valid object stream references.
> I have fixed the problem in my local copy of the code by modifying
> PDFXrefStreamParser.java starting at line 158.
> Current code:
> {code}
> int offset = 0;
> for (int i = 0; i < w1; i++)
> {
>     offset += (currLine[i + w0] & 0x00ff) << ((w1 - i - 1) * 8);
> }
> {code}
> New code:
> {code}
> long offset = 0;
> for (int i = 0; i < w1; i++)
> {
>     offset += ((long) (currLine[i + w0] & 0x00ff)) << ((w1 - i - 1) * 8);
> }
> {code}
> I can submit a sample PDF file if desired (it will be more than 2 GB in size).
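
For anyone who wants to see the overflow in isolation, here is a minimal standalone sketch that runs both loops from the description on the same input. It is not PDFBox code: the class name XrefOffsetDemo, the method names, and the sample entry bytes are made up for illustration, and only the loop bodies come from the snippets above. The sample assumes an xref stream entry with a 1-byte type field (w0 = 1) and a 4-byte big-endian offset field (w1 = 4) holding 0x80000000, the first offset that no longer fits in a signed 32-bit int.

```java
public class XrefOffsetDemo {

    // Same arithmetic as the "Current code" loop: the shift happens in
    // 32-bit int arithmetic, so a set top bit yields a negative value.
    static int decodeAsInt(byte[] currLine, int w0, int w1) {
        int offset = 0;
        for (int i = 0; i < w1; i++) {
            offset += (currLine[i + w0] & 0x00ff) << ((w1 - i - 1) * 8);
        }
        return offset;
    }

    // Same arithmetic as the "New code" loop: each byte is widened to
    // long before shifting, so the full unsigned value is preserved.
    static long decodeAsLong(byte[] currLine, int w0, int w1) {
        long offset = 0;
        for (int i = 0; i < w1; i++) {
            offset += ((long) (currLine[i + w0] & 0x00ff)) << ((w1 - i - 1) * 8);
        }
        return offset;
    }

    public static void main(String[] args) {
        // Hypothetical entry: 1 type byte, then the offset 0x80000000 (2 GiB).
        byte[] entry = {0x01, (byte) 0x80, 0x00, 0x00, 0x00};
        System.out.println(decodeAsInt(entry, 1, 4));  // prints -2147483648
        System.out.println(decodeAsLong(entry, 1, 4)); // prints 2147483648
    }
}
```

The cast has to be applied before the shift. Casting the result of an int shift to long afterwards would only sign-extend the value that has already overflowed.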
--
This message was sent by Atlassian Jira
(v8.3.4#803005)