[ https://issues.apache.org/jira/browse/PDFBOX-4131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16382017#comment-16382017 ]
Maison commented on PDFBOX-4131:
--------------------------------

OK, I attached a simple file that triggers the bug: fields_loop_kid_is_parent.pdf

    File pdf = new File("fields_loop_kid_is_parent.pdf");
    PDDocument doc = PDDocument.load(pdf);
    PDAcroForm acroForm = doc.getDocumentCatalog().getAcroForm();
    PDFieldTree tree = acroForm.getFieldTree();
    for (PDField field : tree)
    {
        System.out.println("field = " + field);
    }

A quick fix for this case is to add the lines marked "+" in PDNonTerminalField.getChildren():

      if (kid instanceof COSDictionary)
      {
    +     if (kid.equals(getCOSObject())) {
    +         //System.out.println("RECURSION AVOIDED !");
    +         continue;
    +     }
          PDField field = PDField.fromDictionary(getAcroForm(), (COSDictionary) kid, this);

I also attached a less simple case, where obj 5 has kid 9 and obj 9 has kid 5: fields_loop_kid_is_parent_of_other_group.pdf

Both files were generated with LibreOffice, then manually edited.

> Stack overflow in fields
> ------------------------
>
>                 Key: PDFBOX-4131
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4131
>             Project: PDFBox
>          Issue Type: Bug
>            Reporter: Maison
>            Priority: Major
>         Attachments: fields_loop_kid_is_parent.pdf,
>                      fields_loop_kid_is_parent_of_other_group.pdf
>
>
> Note: unfortunately I cannot attach the original file, for confidentiality reasons.
> I have a PDF containing some fields:
>
> /AcroForm 2 0 R
> ...
> 2 0 obj
> <<
> /DR 5 0 R
> /Fields [ ... 15 0 R 16 0 R 17 0 R ...]
> >>
> and then for field object 17:
> 17 0 obj
> <<
> /DA (/Helv 8 Tf 0 g)
> /FT /Tx
> /Ff 1
> /Kids [17 0 R 88 0 R 89 0 R]
> /MaxLen 0
> /T (Nom)
> >>
> Here we see that this field is contained in its own list of children: this
> triggers infinite recursion during parsing (or at least during the call to
> getAcroForm()):
>       at java.lang.StringBuilder.toString(StringBuilder.java:407)
>       at org.apache.pdfbox.cos.PDFDocEncoding.toString(PDFDocEncoding.java:135)
>       at org.apache.pdfbox.cos.COSString.getString(COSString.java:203)
>       at org.apache.pdfbox.cos.COSDictionary.getString(COSDictionary.java:670)
>       at org.apache.pdfbox.pdmodel.interactive.form.PDFieldFactory.createField(PDFieldFactory.java:64)
>       at org.apache.pdfbox.pdmodel.interactive.form.PDField.fromDictionary(PDField.java:80)
>       at org.apache.pdfbox.pdmodel.interactive.form.PDNonTerminalField.getChildren(PDNonTerminalField.java:139)
>       at org.apache.pdfbox.pdmodel.interactive.form.PDFieldTree$FieldIterator.enqueueKids(PDFieldTree.java:99)
>       at org.apache.pdfbox.pdmodel.interactive.form.PDFieldTree$FieldIterator.enqueueKids(PDFieldTree.java:102)
>       ...
>
> I use PDFBox 2.0.8.
> There is a similar problem with the preflight parser.
> In this case a simple fix would be to ignore child fields that are the
> same as their parent; however, the reference loop may be much more
> complicated.
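
On the closing remark that the reference loop may be much more complicated than a kid pointing back at its parent: one common way to cover the general case is to remember every field already visited and skip any kid that has been seen before. The sketch below only illustrates that idea and is not the PDFBox fix. FieldLoopSketch, Node and walk are hypothetical names, and Node is a stand-in for a field dictionary rather than a PDFBox class.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Illustration only: hypothetical names, no PDFBox types involved. */
final class FieldLoopSketch {

    /** Hypothetical stand-in for a field dictionary with a /Kids array. */
    static final class Node {
        final String name;
        final List<Node> kids = new ArrayList<Node>();

        Node(String name) {
            this.name = name;
        }
    }

    /**
     * Walks the field graph breadth-first and returns each reachable
     * node's name exactly once, even when /Kids entries form a loop.
     */
    static List<String> walk(List<Node> roots) {
        // Identity-based set: a node counts as "seen" only if it is the
        // very same object, which mirrors "same indirect object" in a PDF.
        Set<Node> seen =
            Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> queue = new ArrayDeque<Node>();
        List<String> order = new ArrayList<String>();

        for (Node root : roots) {
            if (seen.add(root)) {
                queue.add(root);
            }
        }
        while (!queue.isEmpty()) {
            Node node = queue.remove();
            order.add(node.name);
            for (Node kid : node.kids) {
                // Set.add returns false for a node that was already seen,
                // so a kid that loops back to any ancestor is skipped.
                if (seen.add(kid)) {
                    queue.add(kid);
                }
            }
        }
        return order;
    }
}
```

With a node "17" whose kids are itself, "88" and "89" (the shape of object 17 in the quoted issue), walk visits 17, 88 and 89 once each and terminates. The same holds for the two-object loop from the second attachment, where 5 has kid 9 and 9 has kid 5. Comparing by identity rather than by equals() is a design choice of this sketch: it avoids treating two distinct but equal-looking dictionaries as one field.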