[ 
https://issues.apache.org/jira/browse/PDFBOX-2016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr resolved PDFBOX-2016.
-------------------------------------

       Resolution: Fixed
    Fix Version/s: 2.0.0
                   1.8.5
         Assignee: Tilman Hausherr

Fixed in rev 1585781 for the trunk and rev 1585782 for the 1.8 branch. Thanks, 
good observation!

> Stream parsing still incorrect if length value is wrong
> -------------------------------------------------------
>
>                 Key: PDFBOX-2016
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2016
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.6.0, 1.8.4
>            Reporter: Andrew Olsen
>            Assignee: Tilman Hausherr
>             Fix For: 1.8.5, 2.0.0
>
>         Attachments: Hello.pdf, Hello_broken.pdf
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> From issue PDFBOX-1333 - "In 1.7.0 stream parsing in BaseParser was optimized 
> to use length value if available. The advantage is faster parsing and 
> independence of 'endstream' bytes sequences in stream. However the 
> disadvantage is that streams with wrong length values cannot be parsed 
> anymore" - etc. 
> This issue was marked as fixed now that COSStreams can once again be parsed 
> by reading all the way to 'endstream'. However, the resulting COSStream 
> object still contains the expected length, not the true length. When parsing 
> the COSStream with a PDFStreamParser, the call to 
> COSStream#getUnfilteredStream uses getLength() instead of getLengthWritten() 
> to limit the amount of data that can be read. This can truncate the stream, 
> so incorrect length values still lead to missing data, which limits the 
> usefulness of the last fix. Changing the call to getLengthWritten() should 
> solve the problem.
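
The truncation described in the report can be illustrated with a minimal, self-contained sketch. It does not use PDFBox and is not the committed fix. The class StreamLengthSketch, the helper readLimited, and the sample content stream are all hypothetical; the two limits only stand in for the values the report attributes to getLength() (the declared, possibly wrong length) and getLengthWritten (the number of bytes actually stored).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class StreamLengthSketch {

    /**
     * Hypothetical helper: reads at most 'limit' bytes from 'in',
     * the way a length-bounded stream reader would.
     */
    static byte[] readLimited(InputStream in, long limit) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        long remaining = limit;
        while (remaining > 0) {
            int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
            if (n < 0) {
                break; // underlying data ended before the limit was reached
            }
            out.write(buf, 0, n);
            remaining -= n;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Bytes recovered by scanning all the way to 'endstream' (assumed sample data).
        byte[] written = "BT /F1 12 Tf (Hello) Tj ET".getBytes(StandardCharsets.US_ASCII);

        // Assumed wrong value from the stream dictionary; smaller than the real size.
        long declaredLength = 10;
        // Stand-in for the number of bytes actually written to the stream.
        long lengthWritten = written.length;

        byte[] truncated = readLimited(new ByteArrayInputStream(written), declaredLength);
        byte[] complete = readLimited(new ByteArrayInputStream(written), lengthWritten);

        System.out.println("limited by declared length: " + truncated.length + " bytes");
        System.out.println("limited by written length:  " + complete.length + " bytes");
    }
}
```

Bounding the read by the declared length drops every byte past that value, even though the parser already recovered the full data. Bounding it by the written length returns everything that was stored.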



--
This message was sent by Atlassian JIRA
(v6.2#6252)
