[
https://issues.apache.org/jira/browse/PDFBOX-2810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andreas Lehmkühler reassigned PDFBOX-2810:
------------------------------------------
Assignee: Andreas Lehmkühler
> Indirect object marked as direct by PDFParser
> ---------------------------------------------
>
> Key: PDFBOX-2810
> URL: https://issues.apache.org/jira/browse/PDFBOX-2810
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 1.8.9
> Reporter: Petras
> Assignee: Andreas Lehmkühler
> Priority: Major
>
> I've noticed an issue with PDFParser: it marks the COSObject for an indirect
> reference to a COSDictionary as "direct", while the dereferenced object
> (the COSDictionary) correctly reports the indirect state.
> Consider this extract from PDF:
> {code}
> 1 0 obj
> <<
> /Type /Catalog
> /Outlines 2 0 R
> /Pages 3 0 R
> >>
> endobj
> 2 0 obj
> <<
> /Type /Outlines
> /Count 0
> >>
> endobj
> 3 0 obj
> <<
> /Type /Pages
> /Kids [4 0 R]
> /Count 1
> >>
> endobj
> 4 0 obj
> <<
> /Type /Page
> ...
> >>
> endobj
> {code}
> Reading catalog dictionary entry "{{/Outlines 2 0 R}}":
> {code}
> final COSDictionary cosCatalog = catalog.getCOSDictionary();
> // WORKS with dereferencing
> final COSBase dictOutlines = cosCatalog.getDictionaryObject(COSName.OUTLINES);
> Assert.assertFalse("Expected /Outlines indirect", dictOutlines.isDirect());
> {code}
> {color:red} FAILS without dereferencing{color}
> {code}
> Assert.assertFalse("Expected /Outlines indirect",
> cosCatalog.getItem(COSName.OUTLINES).isDirect());
> {code}
> The culprit is the code in
> {{org.apache.pdfbox.pdfparser.BaseParser#parseCOSDictionary}}, which always
> marks a COSObject containing a COSDictionary as "direct".
> I also noticed that when an indirect COSObject is a member of a COSArray, its
> state is not changed to "direct". The following code works when reading the
> array element with or without dereferencing:
> {code}
> // /Pages 3 0 R
> final COSDictionary dictPages = (COSDictionary)
> cosCatalog.getDictionaryObject(COSName.PAGES);
> // /Kids [4 0 R]
> final COSBase objKids = dictPages.getDictionaryObject(COSName.KIDS);
> // WORKS without dereference
> COSBase firstElement = ((COSArray) objKids).get(0);
> Assert.assertFalse("Expected /Kids array element is indirect object",
> firstElement.isDirect());
> // WORKS with dereference
> firstElement = ((COSArray) objKids).getObject(0);
> Assert.assertFalse("Expected /Kids array element is indirect object",
> firstElement.isDirect());
> {code}
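
As a rough illustration of the behaviour described in the report, here is a minimal, self-contained Java sketch. It does not use PDFBox at all: Node, Ref, Dict, putAlwaysDirect and putRespectingRefs are hypothetical stand-ins for COSBase, COSObject, COSDictionary and the dictionary-parsing step. The "always direct" step is an assumption based on the report's description of parseCOSDictionary, and the alternative shown is only one possible way to handle it, not necessarily how the project resolves the issue.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT PDFBox classes.
class DirectFlagSketch {

    /** Base of the toy object model: every node carries a "direct" flag. */
    static class Node {
        private boolean direct;
        boolean isDirect() { return direct; }
        void setDirect(boolean direct) { this.direct = direct; }
    }

    /** Toy indirect reference such as "2 0 R" (plays the role of COSObject). */
    static class Ref extends Node {
        final Node target;
        Ref(Node target) { this.target = target; }
    }

    /** Toy dictionary (plays the role of COSDictionary). */
    static class Dict extends Node {
        private final Map<String, Node> items = new LinkedHashMap<>();

        void put(String key, Node value) { items.put(key, value); }

        /** Raw entry without dereferencing (analogous to getItem). */
        Node getItem(String key) { return items.get(key); }

        /** Entry with references resolved (analogous to getDictionaryObject). */
        Node getDictionaryObject(String key) {
            Node value = items.get(key);
            return (value instanceof Ref) ? ((Ref) value).target : value;
        }
    }

    /** Assumed buggy behaviour: every parsed dictionary value is flagged direct. */
    static void putAlwaysDirect(Dict dict, String key, Node value) {
        value.setDirect(true);
        dict.put(key, value);
    }

    /** One possible alternative: only non-reference values are flagged direct. */
    static void putRespectingRefs(Dict dict, String key, Node value) {
        value.setDirect(!(value instanceof Ref));
        dict.put(key, value);
    }

    public static void main(String[] args) {
        // Stands for "2 0 obj": stored indirectly, so its flag stays false.
        Dict outlines = new Dict();

        Dict buggy = new Dict();
        putAlwaysDirect(buggy, "Outlines", new Ref(outlines));
        System.out.println(buggy.getItem("Outlines").isDirect());             // true (the reported failure)
        System.out.println(buggy.getDictionaryObject("Outlines").isDirect()); // false

        Dict fixed = new Dict();
        putRespectingRefs(fixed, "Outlines", new Ref(outlines));
        System.out.println(fixed.getItem("Outlines").isDirect());             // false
    }
}
```

In this toy model the raw entry and the dereferenced entry disagree only when the reference wrapper itself is flagged, which matches the asymmetry between getItem and getDictionaryObject shown in the report.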