[
https://issues.apache.org/jira/browse/PDFBOX-576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vincent Hennebert updated PDFBOX-576:
-------------------------------------
Attachment: fixStreamParsing.diff
Patch fixing the issue
> Errors in Stream Object Parsing
> -------------------------------
>
> Key: PDFBOX-576
> URL: https://issues.apache.org/jira/browse/PDFBOX-576
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 0.8.0-incubator, 1.0.0
> Reporter: Vincent Hennebert
> Attachments: fixStreamParsing.diff
>
>
> The readUntilEndStream method in org.apache.pdfbox.pdfparser.BaseParser
> doesn't work properly. The read(byte[]) method doesn't guarantee that it will
> fill the buffer, and as long as the buffer hasn't been filled the value of
> nextIdx will be wrong. For example, say the call to read returns 5 bytes in
> the buffer. nextIdx will have value 5, i.e. points to a yet uninitialized
> byte that will be written to the output stream. Same at the next loop
> iteration, and in the end the stream will start with 4 null bytes that don't
> belong to the original one.
> Also, if the stream is terminated by endobj instead of endstream, the
> possible bytes that are at the beginning of the buffer and precede endobj
> won't be written out. The for loop will stop at the end of the buffer instead
> of looping back to the beginning of it if necessary (say, if endobj happens
> to occupy slots 3 to 8 of the buffer).
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.