Petras created PDFBOX-2810:
------------------------------

             Summary: Indirect object marked as direct by PDFParser
                 Key: PDFBOX-2810
                 URL: https://issues.apache.org/jira/browse/PDFBOX-2810
             Project: PDFBox
          Issue Type: Bug
          Components: Parsing
    Affects Versions: 1.8.9
            Reporter: Petras


I've noticed an issue with PDFParser: it marks a COSObject that is an indirect 
reference to a COSDictionary as "direct", while the dereferenced object 
(the COSDictionary) correctly indicates the indirect state.

Consider this extract from PDF:
{code}
1 0 obj
<<
 /Type /Catalog
 /Outlines 2 0 R
 /Pages 3 0 R
>>
endobj

2 0 obj
<<
 /Type /Outlines
 /Count 0
>>
endobj

3 0 obj
<<
 /Type /Pages
 /Kids [4 0 R]
 /Count 1
>>
endobj

4 0 obj
<<
 /Type /Page
...
>>
endobj
{code}

Reading catalog dictionary entry "{{/Outlines 2 0 R}}":
{code}
final COSDictionary cosCatalog = catalog.getCOSDictionary();

// WORKS with dereferencing
final COSBase dictOutlines = cosCatalog.getDictionaryObject(COSName.OUTLINES);
Assert.assertFalse("Expected /Outlines indirect", dictOutlines.isDirect());
{code}
{color:red} FAILS without dereferencing{color}
{code}
Assert.assertFalse("Expected /Outlines indirect", 
cosCatalog.getItem(COSName.OUTLINES).isDirect());
{code}

The culprit is the code in 
{{org.apache.pdfbox.pdfparser.BaseParser#parseCOSDictionary}}, which always sets 
a COSObject containing a COSDictionary as "direct".
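
To make the suspected cause concrete, here is a minimal sketch of the behaviour and of one possible guard. It does not use PDFBox: {{Value}}, {{Reference}}, {{Dictionary}}, {{putAlwaysDirect}} and {{putGuarded}} are hypothetical stand-ins invented for illustration, and the guard is only an assumption about how a fix might look, not the actual parser code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Stand-in model for illustration only; these are NOT PDFBox classes. */
public class DirectFlagSketch {

    /** Hypothetical stand-in for a COS value carrying a "direct" flag. */
    static class Value {
        private boolean direct;
        boolean isDirect() { return direct; }
        void setDirect(boolean direct) { this.direct = direct; }
    }

    /** Hypothetical stand-in for an indirect reference such as "2 0 R". */
    static class Reference extends Value {
        final Value target;
        Reference(Value target) { this.target = target; }
    }

    /** Hypothetical stand-in for a dictionary that stores entries undereferenced. */
    static class Dictionary extends Value {
        private final Map<String, Value> items = new LinkedHashMap<>();
        void setItem(String key, Value value) { items.put(key, value); }
        Value getItem(String key) { return items.get(key); }
    }

    /** Behaviour described in the report: every entry value is flagged direct. */
    static void putAlwaysDirect(Dictionary dict, String key, Value value) {
        value.setDirect(true);
        dict.setItem(key, value);
    }

    /** Assumed guard: leave references untouched, flag only inline values. */
    static void putGuarded(Dictionary dict, String key, Value value) {
        if (!(value instanceof Reference)) {
            value.setDirect(true);
        }
        dict.setItem(key, value);
    }

    public static void main(String[] args) {
        Dictionary catalog = new Dictionary();
        putAlwaysDirect(catalog, "Outlines", new Reference(new Dictionary()));
        System.out.println(catalog.getItem("Outlines").isDirect()); // prints true

        Dictionary guarded = new Dictionary();
        putGuarded(guarded, "Outlines", new Reference(new Dictionary()));
        putGuarded(guarded, "Count", new Value());
        System.out.println(guarded.getItem("Outlines").isDirect()); // prints false
        System.out.println(guarded.getItem("Count").isDirect());    // prints true
    }
}
```

In this model the reference entry keeps its indirect state with the guard, while an inline value such as {{/Count 0}} is still flagged direct.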

I also noticed that when an indirect COSObject is a member of a COSArray, its 
state is not changed to "direct". This code works when reading an array element 
with or without dereferencing:
{code}
// /Pages 3 0 R
final COSDictionary dictPages = (COSDictionary) 
cosCatalog.getDictionaryObject(COSName.PAGES);
// /Kids [4 0 R]
final COSBase objKids = dictPages.getDictionaryObject(COSName.KIDS);

// WORKS without dereference
COSBase firstElement = ((COSArray) objKids).get(0);
Assert.assertFalse("Expected /Kids array element is indirect object", 
firstElement.isDirect());

// WORKS with dereference
firstElement = ((COSArray) objKids).getObject(0);
Assert.assertFalse("Expected /Kids array element is indirect object", 
firstElement.isDirect());
{code}



