[
https://issues.apache.org/jira/browse/PDFBOX-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andreas Lehmkühler reassigned PDFBOX-4097:
------------------------------------------
Assignee: Andreas Lehmkühler
> Compressed objects are lost when the brute-force search fails to handle
> compressed streams
> ---------------------------------------------------------------------------------------
>
> Key: PDFBOX-4097
> URL: https://issues.apache.org/jira/browse/PDFBOX-4097
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 2.0.8
> Reporter: Cheng Zhong
> Assignee: Andreas Lehmkühler
> Priority: Major
> Attachments: 奥美医疗-IPO.pdf
>
>
> Compressed objects described in cross-reference streams are lost when the
> brute-force search fails to handle such streams.
> The attached PDF has an object 1336, but its offset pointed to object 1828.
> This inconsistency led to a brute-force search (triggered in
> *COSParser.checkXrefOffsets*).
> During the search (in *bfSearchForObjStreams*), object streams 1828, 1829
> and 1830 failed to decompress because the streams were "corrupted" (the
> *Params* field was missing in the dictionary, or the *Filter* was wrong).
> Thus, 462 compressed objects described in cross-reference streams are lost.
> Since important objects (the Root, the Pages, etc.) referred to objects in
> 1828 or one of the other streams, they all resolved to null (because the
> corrected XRefOffsets doesn't have them). Further parsing is impossible.
> However, when I bypassed *checkXrefOffsets*, the PDF displayed correctly
> without any (noticeable) error. It seems that object 1336 is not used in the
> PDF.
> "Corrupted" 1828:
> {code:java}
> 1828 0 obj
> <<
> /Length 2176
> /Type /ObjStm
> /N 200
> /First 2103
> /Filter /FlatDecode
> >>
> ...{code}
> This stream is not handled properly by *bfSearchForObjStreams*, but it is
> parsed successfully by *parseObjectStream*.
>
> Would it be nice to have a fallback that preserves the key offsets of
> compressed stream objects when we hit an error in the brute-force search?
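
One way to picture the fallback proposed above is the minimal sketch below. It is not PDFBox code: the class XrefFallback, the method merge and the sample numbers in main are hypothetical. It assumes the xref table is modelled as a map from object number to a long, where a positive value is a byte offset and a negative value -n means "stored in object stream n". Generation numbers are ignored for brevity.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of PDFBox. */
public class XrefFallback {

    /**
     * Merges the brute-force result with the original xref entries.
     * Assumed convention: value >= 0 is a byte offset, value < 0 means the
     * object is compressed inside object stream number -value.
     */
    public static Map<Long, Long> merge(Map<Long, Long> original,
                                        Map<Long, Long> bruteForce) {
        // Entries found by the brute-force search always take precedence.
        Map<Long, Long> merged = new HashMap<>(bruteForce);
        for (Map.Entry<Long, Long> entry : original.entrySet()) {
            long value = entry.getValue();
            if (value >= 0) {
                continue; // uncompressed object: only the brute-force offset is trusted
            }
            long streamNumber = -value;
            Long streamOffset = bruteForce.get(streamNumber);
            // Keep the compressed entry only if the search missed the object
            // but did locate the containing object stream at a real offset.
            if (!merged.containsKey(entry.getKey())
                    && streamOffset != null && streamOffset >= 0) {
                merged.put(entry.getKey(), value);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Made-up numbers, loosely modelled on the report.
        Map<Long, Long> original = new HashMap<>();
        original.put(1L, -1828L);   // some object compressed in stream 1828
        original.put(1828L, 5000L); // the object stream itself
        original.put(1336L, 7000L); // entry whose offset was inconsistent

        Map<Long, Long> bruteForce = new HashMap<>();
        bruteForce.put(1828L, 5200L); // stream located, contents not decoded

        Map<Long, Long> merged = merge(original, bruteForce);
        System.out.println(merged.get(1L));            // -1828
        System.out.println(merged.get(1828L));         // 5200
        System.out.println(merged.containsKey(1336L)); // false
    }
}
```

With such a merge, an object whose stream could not be decoded during the brute-force pass would still have an entry, so a later, more tolerant parse of that stream would get a chance to resolve it.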
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)