[
https://issues.apache.org/jira/browse/PDFBOX-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tilman Hausherr resolved PDFBOX-2016.
-------------------------------------
Resolution: Fixed
Fix Version/s: 2.0.0, 1.8.5
Assignee: Tilman Hausherr
Fixed in rev 1585781 for the trunk and rev 1585782 for the 1.8 branch. Thanks,
good observation!
> Stream parsing still incorrect if length value is wrong
> -------------------------------------------------------
>
> Key: PDFBOX-2016
> URL: https://issues.apache.org/jira/browse/PDFBOX-2016
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 1.6.0, 1.8.4
> Reporter: Andrew Olsen
> Assignee: Tilman Hausherr
> Fix For: 1.8.5, 2.0.0
>
> Attachments: Hello.pdf, Hello_broken.pdf
>
> Original Estimate: 2h
> Remaining Estimate: 2h
>
> From issue PDFBOX-1333 - "In 1.7.0 stream parsing in BaseParser was optimized
> to use length value if available. The advantage is faster parsing and
> independence of 'endstream' bytes sequences in stream. However the
> disadvantage is that streams with wrong length values cannot be parsed
> anymore" - etc.
> That issue was marked as fixed once COSStreams could again be parsed by
> reading all the way to 'endstream'. However, the resulting COSStream object
> still reports the expected (declared) length, not the true length. When the
> COSStream is parsed with a PDFStreamParser, the call to
> COSStream#getUnfilteredStream uses getLength() instead of getLengthWritten()
> to limit the amount of data that can be read. If the declared length is too
> small, the stream is truncated, so incorrect length values still lead to
> missing data, which limits the usefulness of the earlier fix. Changing the
> call to use getLengthWritten() should solve the problem.
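
For readers who want to see the effect described above in isolation, here is a
minimal sketch. It does not use PDFBox at all: the class name
LengthTruncationSketch, the helper readLimited, and the sample bytes are all
hypothetical, and the two variables only stand in for what the report calls
getLength() and getLengthWritten(). It only illustrates why limiting a read by
a wrong declared length loses data while limiting by the number of bytes
actually written does not.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; this is not PDFBox code.
class LengthTruncationSketch {

    // Returns at most 'limit' bytes of 'data', mimicking a length-limited read.
    static byte[] readLimited(byte[] data, long limit) {
        int n = (int) Math.max(0, Math.min(limit, data.length));
        return Arrays.copyOf(data, n);
    }

    public static void main(String[] args) {
        // Bytes that were really found between 'stream' and 'endstream'.
        byte[] written = "BT /F1 12 Tf (Hello World) Tj ET"
                .getBytes(StandardCharsets.US_ASCII);

        // Stand-in for getLength(): a wrong /Length entry that is too small.
        long declaredLength = 10;
        // Stand-in for getLengthWritten(): the number of bytes actually stored.
        long lengthWritten = written.length;

        byte[] truncated = readLimited(written, declaredLength);
        byte[] complete = readLimited(written, lengthWritten);

        System.out.println("limited by declared length: " + truncated.length + " bytes");
        System.out.println("limited by written length:  " + complete.length + " bytes");
    }
}
```

With the declared length as the limit, everything past that offset is silently
dropped, which matches the missing-data symptom in the report.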
--
This message was sent by Atlassian JIRA
(v6.2#6252)