[
https://issues.apache.org/jira/browse/PDFBOX-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16382017#comment-16382017
]
Maison commented on PDFBOX-4131:
--------------------------------
OK, I attached a simple file that triggers the bug: fields_loop_kid_is_parent.pdf
    File pdf = new File("fields_loop_kid_is_parent.pdf");
    PDDocument doc = PDDocument.load(pdf);
    PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
    PDFieldTree tree = acroForm.getFieldTree();
    // Iterating the field tree is what triggers the stack overflow
    for (PDField field : tree) {
        System.out.println("field = " + field);
    }
A quick fix for this case is to add the following in PDNonTerminalField.getChildren():
     if (kid instanceof COSDictionary)
     {
+        if (kid.equals(getCOSObject())) {
+            //System.out.println("RECURSION AVOIDED !");
+            continue;
+        }
         PDField field = PDField.fromDictionary(getAcroForm(), (COSDictionary) kid, this);
I also attached a less simple case, where obj 5 has kid 9 and obj 9 has kid 5:
fields_loop_kid_is_parent_of_other_group.pdf
Both files were generated with LibreOffice, then manually edited.
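The quick fix above only compares a kid with its direct parent, so it would not catch the second file, where the loop goes through two objects. A more general guard is to remember every object already visited and skip any kid seen before. The following is only a minimal, self-contained sketch of that idea, not PDFBox code: FieldNode, FieldCycleDemo and collectNames are hypothetical names standing in for the field dictionaries and the tree iteration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a field dictionary; this is NOT a PDFBox class.
final class FieldNode {
    final String name;
    final List<FieldNode> kids = new ArrayList<>();

    FieldNode(String name) {
        this.name = name;
    }
}

final class FieldCycleDemo {
    // Walks the field graph breadth-first and returns each node's name once.
    // A kid that was already visited (itself, a parent, or any other node
    // reached earlier) is skipped, so reference loops cannot recurse forever.
    static List<String> collectNames(List<FieldNode> roots) {
        // Identity-based set: two nodes are "the same" only if they are the same object.
        Set<FieldNode> visited = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<FieldNode> queue = new ArrayDeque<>();
        List<String> names = new ArrayList<>();
        for (FieldNode root : roots) {
            if (visited.add(root)) {
                queue.add(root);
            }
        }
        while (!queue.isEmpty()) {
            FieldNode node = queue.remove();
            names.add(node.name);
            for (FieldNode kid : node.kids) {
                if (visited.add(kid)) {
                    queue.add(kid);
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Mirrors the second attachment: obj 5 has kid 9, and obj 9 has kid 5.
        FieldNode obj5 = new FieldNode("obj5");
        FieldNode obj9 = new FieldNode("obj9");
        obj5.kids.add(obj9);
        obj9.kids.add(obj5);
        // Also add the simple case: a node listed among its own kids.
        obj5.kids.add(obj5);
        System.out.println(collectNames(Collections.singletonList(obj5)));
    }
}
```

The visited set is identity-based here on the assumption that a loop always leads back to the very same object; whether that holds for the dictionaries PDFBox resolves from indirect references would need to be checked.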
> Stack overflow in fields
> ------------------------
>
> Key: PDFBOX-4131
> URL: https://issues.apache.org/jira/browse/PDFBOX-4131
> Project: PDFBox
> Issue Type: Bug
> Reporter: Maison
> Priority: Major
> Attachments: fields_loop_kid_is_parent.pdf,
> fields_loop_kid_is_parent_of_other_group.pdf
>
>
> Note: unfortunately I cannot attach the original file for confidentiality reasons.
> I have a PDF containing some fields:
>
> /AcroForm 2 0 R
> ...
> 2 0 obj
> <<
> /DR 5 0 R
> /Fields [ ... 15 0 R 16 0 R 17 0 R ...]
> >>
> and then for field object 17:
> 17 0 obj
> <<
> /DA (/Helv 8 Tf 0 g)
> /FT /Tx
> /Ff 1
> /Kids [17 0 R 88 0 R 89 0 R]
> /MaxLen 0
> /T (Nom)
> >>
> Here we see that this field is listed among its own kids: this triggers
> infinite recursion during parsing (or at least during the call to
> getAcroForm()).
> at java.lang.StringBuilder.toString(StringBuilder.java:407)
> at org.apache.pdfbox.cos.PDFDocEncoding.toString(PDFDocEncoding.java:135)
> at org.apache.pdfbox.cos.COSString.getString(COSString.java:203)
> at org.apache.pdfbox.cos.COSDictionary.getString(COSDictionary.java:670)
> at org.apache.pdfbox.pdmodel.interactive.form.PDFieldFactory.createField(PDFieldFactory.java:64)
> at org.apache.pdfbox.pdmodel.interactive.form.PDField.fromDictionary(PDField.java:80)
> at org.apache.pdfbox.pdmodel.interactive.form.PDNonTerminalField.getChildren(PDNonTerminalField.java:139)
> at org.apache.pdfbox.pdmodel.interactive.form.PDFieldTree$FieldIterator.enqueueKids(PDFieldTree.java:99)
> at org.apache.pdfbox.pdmodel.interactive.form.PDFieldTree$FieldIterator.enqueueKids(PDFieldTree.java:102)
> ...
>
> I use PDFBox 2.0.8.
> There is a similar problem with the preflight parser.
> In this case a simple fix would be to ignore child fields that are the
> same as their parent; however, the reference loop may be much more
> complicated.
>