[ 
https://issues.apache.org/jira/browse/PDFBOX-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16382017#comment-16382017
 ] 

Maison commented on PDFBOX-4131:
--------------------------------

OK, I attached a simple file that triggers the bug: fields_loop_kid_is_parent.pdf

        File pdf = new File("fields_loop_kid_is_parent.pdf");
        PDDocument doc = PDDocument.load(pdf);
        PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
        PDFieldTree tree = acroForm.getFieldTree();
        for (PDField field : tree) {
            System.out.println("field = "+field);
        }

A quick fix for this case is to add the following check in PDNonTerminalField.getChildren():

             if (kid instanceof COSDictionary)
             {
+                if (kid.equals(getCOSObject())) {
+                    //System.out.println("RECURSION AVOIDED !");
+                    continue;
+                }
                 PDField field = PDField.fromDictionary(getAcroForm(), (COSDictionary) kid, this);

 

I also attached a less simple case, where obj 5 has kid 9 and obj 9 has kid 5:
fields_loop_kid_is_parent_of_other_group.pdf

Both files were generated with LibreOffice, then manually edited.
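
The quick fix above only catches a kid that is its own parent. For the second file (obj 5 -> obj 9 -> obj 5), one possible approach is to remember every dictionary already visited and skip repeats. The following is only a rough sketch of that idea, not a patch: FieldNode and FieldLoopSketch are hypothetical stand-ins written with the JDK only, and none of it is PDFBox API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a field dictionary with a /Kids array (not a PDFBox class).
final class FieldNode {
    final String name;
    final List<FieldNode> kids = new ArrayList<>();

    FieldNode(String name) {
        this.name = name;
    }
}

final class FieldLoopSketch {
    // Walks the field hierarchy breadth-first and returns each reachable node's
    // name once, even if the /Kids references form a loop.
    static List<String> visitOnce(List<FieldNode> roots) {
        // Identity-based set: two nodes count as "the same" only if they are
        // the very same object, like an indirect object referenced twice.
        Set<FieldNode> seen =
                Collections.newSetFromMap(new IdentityHashMap<FieldNode, Boolean>());
        Deque<FieldNode> queue = new ArrayDeque<>();
        List<String> order = new ArrayList<>();

        for (FieldNode root : roots) {
            if (seen.add(root)) {
                queue.add(root);
            }
        }
        while (!queue.isEmpty()) {
            FieldNode node = queue.remove();
            order.add(node.name);
            for (FieldNode kid : node.kids) {
                // seen.add returns false for a node that was already enqueued,
                // which is what breaks the reference loop.
                if (seen.add(kid)) {
                    queue.add(kid);
                }
            }
        }
        return order;
    }
}
```

With obj 5 and obj 9 pointing at each other, visitOnce returns each of them exactly once instead of recursing forever. A real fix would presumably track the field dictionaries themselves during iteration; where exactly that check belongs in PDFBox is left open here.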

 

 

> Stack overflow in fields
> ------------------------
>
>                 Key: PDFBOX-4131
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4131
>             Project: PDFBox
>          Issue Type: Bug
>            Reporter: Maison
>            Priority: Major
>         Attachments: fields_loop_kid_is_parent.pdf, 
> fields_loop_kid_is_parent_of_other_group.pdf
>
>
> Note: unfortunately I cannot attach the original file for confidentiality reasons.
> I have a PDF containing some fields:
>  
> /AcroForm 2 0 R
> ...
> 2 0 obj
> <<
> /DR 5 0 R
> /Fields [ ...  15 0 R 16 0 R 17 0 R ...]
> >>
> and then for field object 17:
> 17 0 obj
> <<
> /DA (/Helv 8 Tf 0 g)
> /FT /Tx
> /Ff 1
> /Kids [17 0 R 88 0 R 89 0 R]
> /MaxLen 0
> /T (Nom)
> >>
> Here we see that this field is contained in its own children list: this triggers 
> infinite recursion during parsing (or at least during the call to 
> getAcroForm()):
>     at java.lang.StringBuilder.toString(StringBuilder.java:407)
>     at org.apache.pdfbox.cos.PDFDocEncoding.toString(PDFDocEncoding.java:135)
>     at org.apache.pdfbox.cos.COSString.getString(COSString.java:203)
>     at org.apache.pdfbox.cos.COSDictionary.getString(COSDictionary.java:670)
>     at org.apache.pdfbox.pdmodel.interactive.form.PDFieldFactory.createField(PDFieldFactory.java:64)
>     at org.apache.pdfbox.pdmodel.interactive.form.PDField.fromDictionary(PDField.java:80)
>     at org.apache.pdfbox.pdmodel.interactive.form.PDNonTerminalField.getChildren(PDNonTerminalField.java:139)
>     at org.apache.pdfbox.pdmodel.interactive.form.PDFieldTree$FieldIterator.enqueueKids(PDFieldTree.java:99)
>     at org.apache.pdfbox.pdmodel.interactive.form.PDFieldTree$FieldIterator.enqueueKids(PDFieldTree.java:102)
> ...
>  
> I use PDFBox 2.0.8.
> There is a similar problem with the preflight parser.
> In this case a simple fix would be to ignore child fields that are the 
> same as their parent; however, the reference loop may be much more 
> complicated.
>  



