new line in PDField
Hello,

I load a PDF and then fill out some form fields with Java, using PDField.setValue. One of the form fields is a multi-line text field. How do I insert a line break in a multi-line text field?

Thanks a bunch,
Timothy

Sortimo Walter Rüegg AG, Grabenackerstrasse 1, 8156 Oberhasli
Switchboard 044 852 50 60, Fax 044 852 50 70
timothy.trowbri...@sortimo.ch
www.sortimo.ch
[jira] [Comment Edited] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095331#comment-14095331 ]

Andreas Lehmkühler edited comment on PDFBOX-2261 at 8/13/14 9:53 AM:

Maybe I wasn't specific enough. I understand that, according to the spec, everything is fine with the structure of the PDF. But the following piece of code seems wrong to me:

{code}
private static boolean isButton(PDAcroForm form, COSDictionary field) throws IOException
{
    String fieldType = PDField.findFieldType(field);
    List<COSObjectable> kids = PDField.getKids(form, field);
    if (fieldType == null && kids != null && !kids.isEmpty())
    {
        // sometimes if it is a button the type is only defined by one of the kids entries
        // TODO JH: this is due to inheritance, we need proper support for non-terminal fields
        COSDictionary kid = (COSDictionary) kids.get(0).getCOSObject();
        return isButton(form, kid);
    }
    return "Btn".equals(fieldType);
}
{code}

The question is: does it make sense to search for a button field type among the child nodes if the parent node doesn't have any field type? IMHO not, but maybe I'm missing something.

Extremely long hang during getFields() on a few PDF files

Key: PDFBOX-2261
URL: https://issues.apache.org/jira/browse/PDFBOX-2261
Project: PDFBox
Issue Type: Bug
Components: AcroForm
Affects Versions: 1.8.6
Reporter: Tim Allison
Priority: Minor
Attachments: 966679.pdf, screenshot-pdfdebugger.png

When I run oap.examples.fdf.PrintFields from trunk, the code seems to hang during acroForm.getFields(). This is a heavy load hang.

-- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095353#comment-14095353 ]

Andreas Lehmkühler commented on PDFBOX-1511:

Identically named resources are problematic if two or more of the PDFs to be merged use global resources and the merger merges the page-related resources and the global resources separately, as it did before the patch. By using findResources() instead of getResources(), the proposed patch merges the global and the page-specific resources _before_ adding them to the page itself, so that there are no duplicated names anymore. I don't know if that was intended in the first place, but it solves the problem :-)

OTOH, PDFs using global resources will grow after merging, as all resources are multiplied. But AFAICT global resources aren't used that often.

pdfMerger App produces Garbage

Key: PDFBOX-1511
URL: https://issues.apache.org/jira/browse/PDFBOX-1511
Project: PDFBox
Issue Type: Bug
Components: Utilities
Affects Versions: 1.7.1
Environment: Win XP; Windows Server 2008 R2; java version 1.6.0_21
Reporter: Michael Huber
Fix For: 1.8.7, 2.0.0
Attachments: 1.pdf, 2.pdf, PDFMergerUtility.java, PDFMergerUtility.java.diff, PdfRenderer.java, targetPdfMergeJava.pdf, targetPdfMergeUtilityApp.pdf

The pdfbox utility pdfMerger produces a merged document containing garbage. All merged PDF files are contained, but strings are destroyed. The source PDF files are created with graphviz and are readable without error or disturbance both with Acrobat X and the pdfbox pdfDebug utility.

Another astounding thing is that a handcoded merger using the pdfMergerUtility class works fine when run within Eclipse Juno and creates the same garbage when run from the command line (please see the attached source PdfRenderer.java). I checked everything that comes to mind to find the differences, e.g. Java version, encoding/codepage issues, memory settings, and found nothing.

-- This message was sent by Atlassian JIRA (v6.2#6252)
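The name clash described in this comment can be illustrated with a small, self-contained sketch. It does not use PDFBox at all: resource dictionaries are modelled as plain maps from resource name to a description, and the names SketchPage, MergeSketch, resolveResources and mergeGlobalsSeparately are hypothetical. It only assumes that resolving resources per page (which is what the comment attributes to findResources()) means "inherited entries first, page-level entries on top".

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a page: its own resources plus the "global"
// resources it inherits from the page tree of its source document.
final class SketchPage {
    final Map<String, String> pageResources;
    final Map<String, String> inheritedResources;

    SketchPage(Map<String, String> pageResources, Map<String, String> inheritedResources) {
        this.pageResources = pageResources;
        this.inheritedResources = inheritedResources;
    }

    // Resolve inheritance for this page only: start from the inherited
    // entries, then let page-level entries override them.
    Map<String, String> resolveResources() {
        Map<String, String> resolved = new LinkedHashMap<>(inheritedResources);
        resolved.putAll(pageResources);
        return resolved;
    }
}

final class MergeSketch {
    // Naive merge: all global dictionaries end up in one shared map, so the
    // second document overwrites an identically named entry of the first.
    static Map<String, String> mergeGlobalsSeparately(Map<String, String> first,
                                                      Map<String, String> second) {
        Map<String, String> merged = new LinkedHashMap<>(first);
        merged.putAll(second);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> globalA = Collections.singletonMap("F1", "Helvetica");
        Map<String, String> globalB = Collections.singletonMap("F1", "Courier");
        Map<String, String> none = Collections.emptyMap();

        // One shared "F1" survives, so pages from document A get the wrong font.
        System.out.println(mergeGlobalsSeparately(globalA, globalB));

        // Resolved per page, each page keeps the "F1" of its own document.
        System.out.println(new SketchPage(none, globalA).resolveResources());
        System.out.println(new SketchPage(none, globalB).resolveResources());
    }
}
```

The per-page variant also shows the cost mentioned in the comment: every page carries its own copy of the formerly shared entries, so the merged file grows.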
[jira] [Updated] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maruan Sahyoun updated PDFBOX-2261:

Attachment: RadioButtons.pdf

Sample form with radio buttons and push buttons to clarify the behavior. As can be seen, the parent does not have the field type set. This is fine, as the parents in this case act as groups for the contained and nested fields.
[jira] [Comment Edited] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095419#comment-14095419 ]

Maruan Sahyoun edited comment on PDFBOX-2261 at 8/13/14 12:35 PM:

If it's a non-terminal field without a field type, there is no need to look up the field type for it, IMHO. Maybe change it so that, if it's a terminal field, for inheritable attributes such as the field type we do something like

{code}
field.getInheritableAttribute("FT")
{code}

which would look up the parent hierarchy if the attribute is not part of the field's dictionary. WDYT?
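A rough sketch of what such a lookup could do, written against plain Java collections rather than PDFBox classes. FieldNode and its methods are hypothetical; only the method name getInheritableAttribute is taken from the suggestion above, and its real signature in PDFBox (if it gets added) may well differ.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified field node: a dictionary plus a link to its parent.
final class FieldNode {
    private final Map<String, String> dictionary = new HashMap<>();
    private final FieldNode parent;

    FieldNode(FieldNode parent) {
        this.parent = parent;
    }

    FieldNode put(String key, String value) {
        dictionary.put(key, value);
        return this;
    }

    // Walk from this node towards the root and return the first value found,
    // or null if neither the node nor any ancestor defines the key.
    String getInheritableAttribute(String key) {
        for (FieldNode node = this; node != null; node = node.parent) {
            String value = node.dictionary.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}

final class InheritableAttributeDemo {
    public static void main(String[] args) {
        FieldNode group = new FieldNode(null).put("FT", "Btn");
        FieldNode kid = new FieldNode(group).put("T", "choice1");
        System.out.println(kid.getInheritableAttribute("FT")); // prints "Btn"
    }
}
```

The lookup only ever moves upwards, so its cost is bounded by the depth of the field tree; it never descends into the kids.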
[jira] [Assigned] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler reassigned PDFBOX-2261:

Assignee: Andreas Lehmkühler
[jira] [Updated] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler updated PDFBOX-2261:

Fix Version/s: 2.0.0
[jira] [Commented] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095639#comment-14095639 ]

Maruan Sahyoun commented on PDFBOX-2261:

Wouldn't it be good to start using enums for field type and flags?
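For illustration only, enums along these lines might look as follows. The type and method names (FieldType, FieldFlag, fromPdfName, decode) are hypothetical and not part of PDFBox; the /FT names (Btn, Tx, Ch, Sig) and the three flag bit positions common to all field types are the ones defined in the PDF specification.

```java
import java.util.EnumSet;

// Hypothetical enum for the /FT entry of a field dictionary.
enum FieldType {
    BUTTON("Btn"), TEXT("Tx"), CHOICE("Ch"), SIGNATURE("Sig");

    private final String pdfName;

    FieldType(String pdfName) {
        this.pdfName = pdfName;
    }

    String pdfName() {
        return pdfName;
    }

    // Returns null for an unknown or missing name, e.g. a non-terminal field.
    static FieldType fromPdfName(String name) {
        for (FieldType type : values()) {
            if (type.pdfName.equals(name)) {
                return type;
            }
        }
        return null;
    }
}

// Hypothetical enum for the flags in /Ff that apply to every field type.
enum FieldFlag {
    READ_ONLY(1), REQUIRED(2), NO_EXPORT(3);

    private final int bitPosition; // 1-based, as numbered in the PDF spec

    FieldFlag(int bitPosition) {
        this.bitPosition = bitPosition;
    }

    int mask() {
        return 1 << (bitPosition - 1);
    }

    // Turn the integer value of /Ff into a set of flags.
    static EnumSet<FieldFlag> decode(int flags) {
        EnumSet<FieldFlag> result = EnumSet.noneOf(FieldFlag.class);
        for (FieldFlag flag : values()) {
            if ((flags & flag.mask()) != 0) {
                result.add(flag);
            }
        }
        return result;
    }
}
```

With this, a check such as "Btn".equals(fieldType) would become FieldType.fromPdfName(fieldType) == FieldType.BUTTON, and flag tests would read as set membership instead of bit arithmetic.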
[jira] [Commented] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095648#comment-14095648 ]

Andreas Lehmkühler commented on PDFBOX-2261:

In a first step I've removed the recursion, which didn't make sense. Now PrintFields comes up with a result within seconds, but the result is incomplete: all top-level fields which are simple dictionaries without a field type are discarded. I'm working on a solution.
Build failed in Jenkins: PDFBox-trunk » PDFBox parent #1200
See https://builds.apache.org/job/PDFBox-trunk/org.apache.pdfbox$pdfbox-parent/1200/

maven3-agent.jar already up to date
maven3-interceptor.jar already up to date
maven3-interceptor-commons.jar already up to date
===[JENKINS REMOTING CAPACITY]=== channel started
log4j:WARN No appenders could be found for logger (org.apache.commons.beanutils.converters.BooleanConverter).
log4j:WARN Please initialize the log4j system properly.
Executing Maven: -B -f /home/jenkins/jenkins-slave/workspace/PDFBox-trunk/trunk/pom.xml -Dmaven.repo.local=/home/jenkins/jenkins-slave/maven-repositories/0 clean deploy -Ppedantic
[INFO] Scanning for projects...
[INFO]
[INFO] Reactor Build Order:
[INFO]
[INFO] PDFBox parent
[INFO] Apache FontBox
[INFO] Apache JempBox
[INFO] Apache XmpBox
[INFO] Apache PDFBox
[INFO] Apache Preflight
[INFO] Apache Preflight application
[INFO] Apache PDFBox tools
[INFO] Apache PDFBox application
[INFO] Apache PDFBox examples
[INFO] PDFBox reactor
[INFO]
[INFO]
[INFO] Building PDFBox parent 2.0.0-SNAPSHOT
[INFO]
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ pdfbox-parent ---
[TASKS] Scanning folder 'https://builds.apache.org/job/PDFBox-trunk/org.apache.pdfbox$pdfbox-parent/ws/' for files matching the pattern '**/*.java' - excludes:
[TASKS] Found 0 files to scan for tasks
Found 0 open tasks.
[TASKS] Computing warning deltas based on reference build #1199
[INFO]
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ pdfbox-parent ---
[INFO]
[INFO] --- maven-site-plugin:3.3:attach-descriptor (attach-descriptor) @ pdfbox-parent ---
[INFO]
[INFO] --- apache-rat-plugin:0.10:check (default) @ pdfbox-parent ---
[INFO] 51 implicit excludes (use -debug for more details).
[INFO] Exclude: release.properties
[INFO] 1 resources included (use -debug for more details)
[INFO] Rat check: Summary of files. Unapproved: 0 unknown: 0 generated: 0 approved: 1 licence.
[INFO]
[INFO] --- maven-install-plugin:2.5.1:install (default-install) @ pdfbox-parent ---
[INFO] Installing https://builds.apache.org/job/PDFBox-trunk/org.apache.pdfbox$pdfbox-parent/ws/pom.xml to /home/jenkins/jenkins-slave/maven-repositories/0/org/apache/pdfbox/pdfbox-parent/2.0.0-SNAPSHOT/pdfbox-parent-2.0.0-SNAPSHOT.pom
[INFO]
[INFO] --- maven-deploy-plugin:2.8.1:deploy (default-deploy) @ pdfbox-parent ---
Downloading: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-parent/2.0.0-SNAPSHOT/maven-metadata.xml
Downloaded: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-parent/2.0.0-SNAPSHOT/maven-metadata.xml (611 B at 0.0 KB/sec)
Uploading: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-parent/2.0.0-SNAPSHOT/pdfbox-parent-2.0.0-20140813.160457-535.pom
Uploaded: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-parent/2.0.0-SNAPSHOT/pdfbox-parent-2.0.0-20140813.160457-535.pom (12 KB at 0.2 KB/sec)
Downloading: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-parent/maven-metadata.xml
Downloaded: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-parent/maven-metadata.xml (390 B at 0.0 KB/sec)
Uploading: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-parent/2.0.0-SNAPSHOT/maven-metadata.xml
Uploaded: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-parent/2.0.0-SNAPSHOT/maven-metadata.xml (611 B at 0.2 KB/sec)
Uploading: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-parent/maven-metadata.xml
[jira] [Commented] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095841#comment-14095841 ]

John Hewson commented on PDFBOX-2261:

I encountered this issue in PDFBOX-2164 but only added a workaround for a specific NPE. The recursive approach used by PDField was indeed incorrect; the PDF spec explains why:

{quote}
For purposes of definition and naming, the fields can be organized hierarchically and can inherit attributes from their ancestors in the field hierarchy
{quote}

It seems that the problem with PDFBox's current design is that each node in the field tree is represented by a PDField. However, not every node in the field tree is really a field; some nodes are just there to organise the tree structure.

One solution would be to have PDAcroForm read the field tree and have it produce a Map<String, PDField> of named fields, with all of the inheritance taken into account.

Another solution would be to have fields be aware of their parent in the field tree and look up appropriate values (this would preserve the field tree structure between writes), but the parent node should not be a PDField (!!!). It should be PDNonTerminalField or some similar new class; the PDF spec is clear on this:

{quote}
A non-terminal field does not logically have a type of its own; it is merely a container for inheritable attributes that are intended for descendant terminal fields of any type.
{quote}
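The first solution mentioned in this comment (reading the field tree once and producing a map of named fields with inheritance resolved) could be sketched roughly as below. This is not PDFBox code: FieldTreeNode and FieldTreeFlattener are hypothetical, only the field type is treated as an inherited attribute, the map value is just the resolved type rather than a field object, and every node is assumed to carry a partial name. Names are joined with periods, as the PDF specification does for fully qualified field names.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical node of the field tree: partial name (/T), optional field
// type (/FT, null when the node does not define one) and its kids.
final class FieldTreeNode {
    final String partialName;
    final String fieldType;
    final List<FieldTreeNode> kids = new ArrayList<>();

    FieldTreeNode(String partialName, String fieldType) {
        this.partialName = partialName;
        this.fieldType = fieldType;
    }

    FieldTreeNode add(FieldTreeNode kid) {
        kids.add(kid);
        return this;
    }
}

final class FieldTreeFlattener {
    // Map from fully qualified name to resolved field type, terminal nodes only.
    static Map<String, String> flatten(List<FieldTreeNode> roots) {
        Map<String, String> result = new LinkedHashMap<>();
        for (FieldTreeNode root : roots) {
            visit(root, "", null, result);
        }
        return result;
    }

    // Inherited values are passed down while descending, so each node is
    // visited exactly once and nothing has to be re-derived from the kids.
    private static void visit(FieldTreeNode node, String prefix, String inheritedType,
                              Map<String, String> out) {
        String name = prefix.isEmpty() ? node.partialName : prefix + "." + node.partialName;
        String type = node.fieldType != null ? node.fieldType : inheritedType;
        if (node.kids.isEmpty()) {
            out.put(name, type);
            return;
        }
        for (FieldTreeNode kid : node.kids) {
            visit(kid, name, type, out);
        }
    }
}
```

A group node without a type of its own never appears in the result; it only contributes its name as a prefix and whatever attributes it passes on to its descendants.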
[jira] [Comment Edited] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095841#comment-14095841 ]

John Hewson edited comment on PDFBOX-2261 at 8/13/14 6:18 PM:

I encountered this issue in PDFBOX-2164 but only added a workaround for a specific NPE. The recursive approach used by PDField was indeed incorrect; the PDF spec explains why:

{quote}
For purposes of definition and naming, the fields can be organized hierarchically and can inherit attributes from their ancestors in the field hierarchy
{quote}

It seems that the problem with PDFBox's current design is that each node in the field tree is represented by a PDField. However, not every node in the field tree is really a field; some nodes are just there to organise the tree structure.

One solution would be to have PDAcroForm read the field tree and have it produce a Map<String, PDField> of named fields, with all of the inheritance taken into account.

Another solution would be to have fields be aware of their parent in the field tree and look up appropriate values (this would preserve the field tree structure between writes), but the parent node should not be a PDField (!!!). It should be PDNonTerminalField* or some similar new class; the PDF spec is clear on this:

{quote}
A non-terminal field does not logically have a type of its own; it is merely a container for inheritable attributes that are intended for descendant terminal fields of any type.
{quote}

\* Any new PDNonTerminalField class should probably not inherit from PDField, either.
[jira] [Commented] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095967#comment-14095967 ] Andreas Lehmkühler commented on PDFBOX-2261: I agree with John: we have to introduce a new class like PDNonTerminalField (mine is called PDFieldDictionary), and yes, it shouldn't inherit from PDField. It should be the other way around. PDField currently contains most of the inheritable values, which should be moved to the new class; the others, such as T, TU, TM and AA, should be left in PDField. Kids, Parent and FT should be moved too. That would follow the spec, and each node of the tree would be represented by a single object. Extremely long hang during getFields() on a few PDF files - Key: PDFBOX-2261 URL: https://issues.apache.org/jira/browse/PDFBOX-2261 Project: PDFBox Issue Type: Bug Components: AcroForm Affects Versions: 1.8.6 Reporter: Tim Allison Assignee: Andreas Lehmkühler Priority: Minor Fix For: 2.0.0 Attachments: 966679.pdf, RadioButtons.pdf, screenshot-pdfdebugger.png When I run oap.examples.fdf.PrintFields from trunk, the code seems to hang during acroForm.getFields(). This is a heavy load hang. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095996#comment-14095996 ] Tilman Hausherr commented on PDFBOX-1511: - To verify this we need - a file with global resources - to create an uncompressed copy where we shuffle the names of the resources - to merge it and see what happens. I did try it and no mayhem followed; there were no longer any global resources in the merged file. However, I can't share that file (PDFBOX-2048), and I need one that I can attach here so that Michael and Kirk can also have a look. Now that GSoC2014 is done and the weather is less warm, I'll run my tests on the digitalcorpora site until I hit a file with global resources.
{code}
// Resources attached to the root of the page tree ("global" resources)
PDResources globalRes = document.getDocumentCatalog().getPages().getResources();
if (globalRes != null)
{
    System.out.println("global resources size: " + globalRes.getXObjects().size());
    for (String key : globalRes.getXObjects().keySet())
    {
        System.out.println("global resource: " + key);
    }
}
else
{
    System.out.println("no global resources");
}
{code}
pdfMerger App produces Garbage -- Key: PDFBOX-1511 URL: https://issues.apache.org/jira/browse/PDFBOX-1511 Project: PDFBox Issue Type: Bug Components: Utilities Affects Versions: 1.7.1 Environment: Win XP; Windows Server 2008 R2; java version 1.6.0_21, Reporter: Michael Huber Fix For: 1.8.7, 2.0.0 Attachments: 1.pdf, 2.pdf, PDFMergerUtility.java, PDFMergerUtility.java.diff, PdfRenderer.java, targetPdfMergeJava.pdf, targetPdfMergeUtilityApp.pdf pdfbox Utility pdfMerger produces a merged document containing garbage. All merged pdf files are contained but Strings are destroyed. The source pdf files are created with graphviz and are readable without error or disturbance both with Acrobat X and pdfbox pdfDebug Utility. Another astounding thing is that a handcoded merger using pdfMergerUtility class works fine when run within Eclipse Juno and creates same garbage when run from cmd line (pls.
see attached source PdfRenderer.java) I checked everything that comes in mind to find the differences, e.g. Java version, encoding/codepage issues, memory settings, found nothing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-1511: Attachment: 078117u2.pdf 078117u1.pdf The two 078117*.pdf files have global resources. The difference between the two files is that I have swapped /F1 with /F9, and /Im11 with /Im14, everywhere in the second file. I can't attach the result file after the merge, but it displays fine; try it yourself. The merged file has no global resources. pdfMerger App produces Garbage -- Key: PDFBOX-1511 URL: https://issues.apache.org/jira/browse/PDFBOX-1511 Project: PDFBox Issue Type: Bug Components: Utilities Affects Versions: 1.7.1 Environment: Win XP; Windows Server 2008 R2; java version 1.6.0_21, Reporter: Michael Huber Fix For: 1.8.7, 2.0.0 Attachments: 078117u1.pdf, 078117u2.pdf, 1.pdf, 2.pdf, PDFMergerUtility.java, PDFMergerUtility.java.diff, PdfRenderer.java, targetPdfMergeJava.pdf, targetPdfMergeUtilityApp.pdf pdfbox Utility pdfMerger produces a merged document containing garbage. All merged pdf files are contained but Strings are destroyed. The source pdf files are created with graphviz and are readable without error or disturbance both with Acrobat X and pdfbox pdfDebug Utility. Another astounding thing is that a handcoded merger using pdfMergerUtility class works fine when run within Eclipse Juno and creates same garbage when run from cmd line (pls. see attached source PdfRenderer.java) I checked everything that comes in mind to find the differences, e.g. Java version, encoding/codepage issues, memory settings, found nothing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096029#comment-14096029 ] Tilman Hausherr edited comment on PDFBOX-1511 at 8/13/14 8:12 PM: -- The two 078117*.pdf files have global resources. The difference between the two files is that I have swapped /F1 with /F9, and /Im11 with /Im14, everywhere in the second file. I can't attach the result file after the merge because it is too large, but it displays fine; try it yourself. The merged file has no global resources. was (Author: tilman): The two 078117*.pdf files have global resources. The difference between the two files is that I have swapped /F1 with /F9, and /Im11 with /Im14, everywhere in the second file. I can't attach the result file after the merge, but it displays fine; try it yourself. The merged file has no global resources. pdfMerger App produces Garbage -- Key: PDFBOX-1511 URL: https://issues.apache.org/jira/browse/PDFBOX-1511 Project: PDFBox Issue Type: Bug Components: Utilities Affects Versions: 1.7.1 Environment: Win XP; Windows Server 2008 R2; java version 1.6.0_21, Reporter: Michael Huber Fix For: 1.8.7, 2.0.0 Attachments: 078117u1.pdf, 078117u2.pdf, 1.pdf, 2.pdf, PDFMergerUtility.java, PDFMergerUtility.java.diff, PdfRenderer.java, targetPdfMergeJava.pdf, targetPdfMergeUtilityApp.pdf pdfbox Utility pdfMerger produces a merged document containing garbage. All merged pdf files are contained but Strings are destroyed. The source pdf files are created with graphviz and are readable without error or disturbance both with Acrobat X and pdfbox pdfDebug Utility. Another astounding thing is that a handcoded merger using pdfMergerUtility class works fine when run within Eclipse Juno and creates same garbage when run from cmd line (pls. see attached source PdfRenderer.java) I checked everything that comes in mind to find the differences, e.g. Java version, encoding/codepage issues, memory settings, found nothing.
[jira] [Comment Edited] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14096071#comment-14096071 ] Tilman Hausherr edited comment on PDFBOX-1511 at 8/13/14 8:33 PM: -- Another file (078118.pdf), with global xobject resources: global resource: Im78 global resource: Im79 global resource: Im60 global resource: Im61 global resource: Im75 global resource: Im39 global resource: Im11 global resource: Im12 global resource: Im36 global resource: Im80 global resource: Im53 global resource: Im83 global resource: Im16 global resource: Im33 global resource: Im32 global resource: Im31 global resource: Im57 global resource: Im19 global resource: Im56 global resource: Im68 global resource: Im84 global resource: Im65 global resource: Im62 global resource: Im50 global resource: Im4 global resource: Im49 global resource: Im27 global resource: Im26 global resource: Im3 global resource: Im28 global resource: Im72 global resource: Im71 global resource: Im43 global resource: Im42 The 078117u1.pdf file has these global xobject resources: global resource: Im96 global resource: Im73 global resource: Im59 global resource: Im14 global resource: Tr22 global resource: Im11 global resource: Im17 global resource: Im18 global resource: Im35 global resource: Im53 global resource: Im15 global resource: Im34 global resource: Im83 global resource: Im16 global resource: Im82 global resource: Im58 global resource: Im57 global resource: Im30 global resource: Im87 global resource: Im84 global resource: Im66 global resource: Im67 global resource: Im88 global resource: Im89 global resource: Im5 global resource: Im6 global resource: Im29 global resource: Im28 global resource: Im23 global resource: Im41 global resource: Im90 global resource: Im72 global resource: Im40 global resource: Im71 global resource: Im24 global resource: Im45 global resource: Im21 global resource: Im46 Merging them works too. 
was (Author: tilman): Another file, with global xobject resources: global resource: Im78 global resource: Im79 global resource: Im60 global resource: Im61 global resource: Im75 global resource: Im39 global resource: Im11 global resource: Im12 global resource: Im36 global resource: Im80 global resource: Im53 global resource: Im83 global resource: Im16 global resource: Im33 global resource: Im32 global resource: Im31 global resource: Im57 global resource: Im19 global resource: Im56 global resource: Im68 global resource: Im84 global resource: Im65 global resource: Im62 global resource: Im50 global resource: Im4 global resource: Im49 global resource: Im27 global resource: Im26 global resource: Im3 global resource: Im28 global resource: Im72 global resource: Im71 global resource: Im43 global resource: Im42 The 078117u1.pdf file has these global xobject resources: global resource: Im96 global resource: Im73 global resource: Im59 global resource: Im14 global resource: Tr22 global resource: Im11 global resource: Im17 global resource: Im18 global resource: Im35 global resource: Im53 global resource: Im15 global resource: Im34 global resource: Im83 global resource: Im16 global resource: Im82 global resource: Im58 global resource: Im57 global resource: Im30 global resource: Im87 global resource: Im84 global resource: Im66 global resource: Im67 global resource: Im88 global resource: Im89 global resource: Im5 global resource: Im6 global resource: Im29 global resource: Im28 global resource: Im23 global resource: Im41 global resource: Im90 global resource: Im72 global resource: Im40 global resource: Im71 global resource: Im24 global resource: Im45 global resource: Im21 global resource: Im46 Merging them works too. 
pdfMerger App produces Garbage -- Key: PDFBOX-1511 URL: https://issues.apache.org/jira/browse/PDFBOX-1511 Project: PDFBox Issue Type: Bug Components: Utilities Affects Versions: 1.7.1 Environment: Win XP; Windows Server 2008 R2; java version 1.6.0_21, Reporter: Michael Huber Fix For: 1.8.7, 2.0.0 Attachments: 078117u1.pdf, 078117u2.pdf, 078118.pdf, 1.pdf, 2.pdf, PDFMergerUtility.java, PDFMergerUtility.java.diff, PdfRenderer.java, targetPdfMergeJava.pdf, targetPdfMergeUtilityApp.pdf pdfbox Utility pdfMerger produces a merged document containing garbage. All merged pdf files are contained but Strings are destroyed. The source pdf files are created with graphviz and are readable without error or disturbance both with Acrobat X and pdfbox pdfDebug Utility. Another astounding thing is that a handcoded merger using pdfMergerUtility class works fine when run within Eclipse Juno and creates same garbage when run from cmd line (pls. see attached source PdfRenderer.java) I checked everything that comes in mind to find the differences, e.g. Java version,
[jira] [Updated] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-1511: Attachment: 078118.pdf Another file, with global xobject resources: global resource: Im78 global resource: Im79 global resource: Im60 global resource: Im61 global resource: Im75 global resource: Im39 global resource: Im11 global resource: Im12 global resource: Im36 global resource: Im80 global resource: Im53 global resource: Im83 global resource: Im16 global resource: Im33 global resource: Im32 global resource: Im31 global resource: Im57 global resource: Im19 global resource: Im56 global resource: Im68 global resource: Im84 global resource: Im65 global resource: Im62 global resource: Im50 global resource: Im4 global resource: Im49 global resource: Im27 global resource: Im26 global resource: Im3 global resource: Im28 global resource: Im72 global resource: Im71 global resource: Im43 global resource: Im42 The 078117u1.pdf file has these global xobject resources: global resource: Im96 global resource: Im73 global resource: Im59 global resource: Im14 global resource: Tr22 global resource: Im11 global resource: Im17 global resource: Im18 global resource: Im35 global resource: Im53 global resource: Im15 global resource: Im34 global resource: Im83 global resource: Im16 global resource: Im82 global resource: Im58 global resource: Im57 global resource: Im30 global resource: Im87 global resource: Im84 global resource: Im66 global resource: Im67 global resource: Im88 global resource: Im89 global resource: Im5 global resource: Im6 global resource: Im29 global resource: Im28 global resource: Im23 global resource: Im41 global resource: Im90 global resource: Im72 global resource: Im40 global resource: Im71 global resource: Im24 global resource: Im45 global resource: Im21 global resource: Im46 Merging them works too. 
pdfMerger App produces Garbage -- Key: PDFBOX-1511 URL: https://issues.apache.org/jira/browse/PDFBOX-1511 Project: PDFBox Issue Type: Bug Components: Utilities Affects Versions: 1.7.1 Environment: Win XP; Windows Server 2008 R2; java version 1.6.0_21, Reporter: Michael Huber Fix For: 1.8.7, 2.0.0 Attachments: 078117u1.pdf, 078117u2.pdf, 078118.pdf, 1.pdf, 2.pdf, PDFMergerUtility.java, PDFMergerUtility.java.diff, PdfRenderer.java, targetPdfMergeJava.pdf, targetPdfMergeUtilityApp.pdf pdfbox Utility pdfMerger produces a merged document containing garbage. All merged pdf files are contained but Strings are destroyed. The source pdf files are created with graphviz and are readable without error or disturbance both with Acrobat X and pdfbox pdfDebug Utility. Another astounding thing is that a handcoded merger using pdfMergerUtility class works fine when run within Eclipse Juno and creates same garbage when run from cmd line (pls. see attached source PdfRenderer.java) I checked everything that comes in mind to find the differences, e.g. Java version, encoding/codepage issues, memory settings, found nothing. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14096104#comment-14096104 ] John Hewson commented on PDFBOX-2261: - Sounds good. One small thought: given that a PDField already represents a dictionary, the name PDFieldDictionary is ambiguous, as the Dictionary is implicit in most existing PDFBox names, e.g. PDFont wraps a Font Dictionary. I'd avoid the suffix Dictionary, most PDFBox classes don't use it. Extremely long hang during getFields() on a few PDF files - Key: PDFBOX-2261 URL: https://issues.apache.org/jira/browse/PDFBOX-2261 Project: PDFBox Issue Type: Bug Components: AcroForm Affects Versions: 1.8.6 Reporter: Tim Allison Assignee: Andreas Lehmkühler Priority: Minor Fix For: 2.0.0 Attachments: 966679.pdf, RadioButtons.pdf, screenshot-pdfdebugger.png When I run oap.examples.fdf.PrintFields from trunk, the code seems to hang during acroForm.getFields(). This is a heavy load hang. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14096104#comment-14096104 ] John Hewson edited comment on PDFBOX-2261 at 8/13/14 8:55 PM: -- Sounds good. One small thought: given that a PDField already represents a dictionary, the name PDFieldDictionary is ambiguous, as the Dictionary is implicit in most existing PDFBox names, e.g. PDFont wraps a Font Dictionary. I'd avoid the suffix Dictionary, most PDFBox classes don't use it, especially as a dictionary is a COS-level concept. was (Author: jahewson): Sounds good. One small thought: given that a PDField already represents a dictionary, the name PDFieldDictionary is ambiguous, as the Dictionary is implicit in most existing PDFBox names, e.g. PDFont wraps a Font Dictionary. I'd avoid the suffix Dictionary, most PDFBox classes don't use it. Extremely long hang during getFields() on a few PDF files - Key: PDFBOX-2261 URL: https://issues.apache.org/jira/browse/PDFBOX-2261 Project: PDFBox Issue Type: Bug Components: AcroForm Affects Versions: 1.8.6 Reporter: Tim Allison Assignee: Andreas Lehmkühler Priority: Minor Fix For: 2.0.0 Attachments: 966679.pdf, RadioButtons.pdf, screenshot-pdfdebugger.png When I run oap.examples.fdf.PrintFields from trunk, the code seems to hang during acroForm.getFields(). This is a heavy load hang. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14096104#comment-14096104 ] John Hewson edited comment on PDFBOX-2261 at 8/13/14 8:56 PM: -- [~lehmi], sounds good. One small thought: given that a PDField already represents a dictionary, the name PDFieldDictionary is ambiguous, as the Dictionary is implicit in most existing PDFBox names, e.g. PDFont wraps a Font Dictionary. I'd avoid the suffix Dictionary, most PDFBox classes don't use it, especially as a dictionary is a COS-level concept. was (Author: jahewson): Sounds good. One small thought: given that a PDField already represents a dictionary, the name PDFieldDictionary is ambiguous, as the Dictionary is implicit in most existing PDFBox names, e.g. PDFont wraps a Font Dictionary. I'd avoid the suffix Dictionary, most PDFBox classes don't use it, especially as a dictionary is a COS-level concept. Extremely long hang during getFields() on a few PDF files - Key: PDFBOX-2261 URL: https://issues.apache.org/jira/browse/PDFBOX-2261 Project: PDFBox Issue Type: Bug Components: AcroForm Affects Versions: 1.8.6 Reporter: Tim Allison Assignee: Andreas Lehmkühler Priority: Minor Fix For: 2.0.0 Attachments: 966679.pdf, RadioButtons.pdf, screenshot-pdfdebugger.png When I run oap.examples.fdf.PrintFields from trunk, the code seems to hang during acroForm.getFields(). This is a heavy load hang. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096143#comment-14096143 ] John Hewson edited comment on PDFBOX-2261 at 8/13/14 9:21 PM: -- [~msahyoun], I'm not sure that anyone was suggesting that PDField should inherit from PDNonTerminalField; as you say, that wouldn't be correct. Here's what I had in mind: if we want to preserve the existing notion that a PDField represents an actual field (i.e. a terminal field), then we could use a class hierarchy like the one below:
{code}
abstract class PDFieldTreeNode
class PDNonTerminalField extends PDFieldTreeNode
class PDField extends PDFieldTreeNode
{code}
And the following constructors:
{code}
protected PDFieldTreeNode()
protected PDFieldTreeNode(PDNonTerminalField parent)
public PDNonTerminalField()
public PDNonTerminalField(PDNonTerminalField parent)
public PDField()
public PDField(PDNonTerminalField parent)
{code}
The PDFieldTreeNode class would expose only the properties which can be inherited; it would also include the field inheritance logic, which looks up the given key on its parent when the node does not have the value locally. The PDNonTerminalField class would contain little code and exist mostly to be a concrete implementation of PDFieldTreeNode. The PDField class would contain just the code for those extra properties supported by terminal fields. was (Author: jahewson): [~msahyoun], I'm not sure that anyone was suggesting that PDField should inherit from PDNonTerminalField, as you say that wouldn't be correct. Here's what I had in mind: if we want to preserve the existing notion that a PDField represents an actual field (i.e.
a non-terminal field) then we could use a class hierarchy like that below: {code} abstract class PDFieldTreeNode class PDNonTerminalField extends PDFieldTreeNode class PDField extends PDFieldTreeNode {code} And the following constructors: {code} protected PDFieldTreeNode() protected PDFieldTreeNode(PDNonTerminalField parent) public PDNonTerminalField() public PDNonTerminalField(PDNonTerminalField parent) public PDField() public PDField(PDNonTerminalField parent) {code} The PDFieldTreeNode class would expose only the properties which can be inherited. The PDNonTerminalField class would contain little code, and exist mostly just to be a concrete implementation of PDFieldTreeNode. The PDField class will contain just the code for those extra properties supported by terminal fields. The field inheritance logic will be contained exclusively in PDNonTerminalField, which will lookup the given key on it's parent when it does not have the value locally. Extremely long hang during getFields() on a few PDF files - Key: PDFBOX-2261 URL: https://issues.apache.org/jira/browse/PDFBOX-2261 Project: PDFBox Issue Type: Bug Components: AcroForm Affects Versions: 1.8.6 Reporter: Tim Allison Assignee: Andreas Lehmkühler Priority: Minor Fix For: 2.0.0 Attachments: 966679.pdf, RadioButtons.pdf, screenshot-pdfdebugger.png When I run oap.examples.fdf.PrintFields from trunk, the code seems to hang during acroForm.getFields(). This is a heavy load hang. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096185#comment-14096185 ] Maruan Sahyoun commented on PDFBOX-2261: [~jahewson] I like that suggestion. One question: if I understand the current model correctly, PDField itself doesn't represent an actual field, only its subclasses do, so instead of {code} class PDField extends PDFieldTreeNode {code} it will be {code} abstract class PDField extends PDFieldTreeNode {code} Extremely long hang during getFields() on a few PDF files - Key: PDFBOX-2261 URL: https://issues.apache.org/jira/browse/PDFBOX-2261 Project: PDFBox Issue Type: Bug Components: AcroForm Affects Versions: 1.8.6 Reporter: Tim Allison Assignee: Andreas Lehmkühler Priority: Minor Fix For: 2.0.0 Attachments: 966679.pdf, RadioButtons.pdf, screenshot-pdfdebugger.png When I run oap.examples.fdf.PrintFields from trunk, the code seems to hang during acroForm.getFields(). This is a heavy load hang. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096208#comment-14096208 ] Maruan Sahyoun commented on PDFBOX-1511: If I understand the spec correctly {quote} Resources (Required; inheritable) A dictionary containing any resources required by the page (see 7.8.3, Resource Dictionaries). If the page requires no resources, the value of this entry shall be an empty dictionary. Omitting the entry entirely indicates that the resources shall be inherited from an ancestor node in the page tree. {quote} a specific page either has its own resources, uses the resources of an ancestor, or has none; there is no mix. pdfMerger App produces Garbage -- Key: PDFBOX-1511 URL: https://issues.apache.org/jira/browse/PDFBOX-1511 Project: PDFBox Issue Type: Bug Components: Utilities Affects Versions: 1.7.1 Environment: Win XP; Windows Server 2008 R2; java version 1.6.0_21, Reporter: Michael Huber Fix For: 1.8.7, 2.0.0 Attachments: 078117u1.pdf, 078117u2.pdf, 078118.pdf, 1.pdf, 2.pdf, PDFMergerUtility.java, PDFMergerUtility.java.diff, PdfRenderer.java, targetPdfMergeJava.pdf, targetPdfMergeUtilityApp.pdf pdfbox Utility pdfMerger produces a merged document containing garbage. All merged pdf files are contained but Strings are destroyed. The source pdf files are created with graphviz and are readable without error or disturbance both with Acrobat X and pdfbox pdfDebug Utility. Another astounding thing is that a handcoded merger using pdfMergerUtility class works fine when run within Eclipse Juno and creates same garbage when run from cmd line (pls. see attached source PdfRenderer.java) I checked everything that comes in mind to find the differences, e.g. Java version, encoding/codepage issues, memory settings, found nothing. -- This message was sent by Atlassian JIRA (v6.2#6252)
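The "own, inherited, or none, but no mix" reading of the quoted spec text can be illustrated with a small model. This is a sketch only: PageTreeNode is a hypothetical class, not part of PDFBox, resource dictionaries are reduced to plain string maps, and returning an empty map when no node on the path defines Resources is an assumption of this sketch (the spec marks the entry as required).

```java
import java.util.Collections;
import java.util.Map;

/** Hypothetical model of a page tree node; not a PDFBox class. */
class PageTreeNode {
    final PageTreeNode parent;            // null for the root of the page tree
    final Map<String, String> resources;  // null means the Resources entry is omitted

    PageTreeNode(PageTreeNode parent, Map<String, String> resources) {
        this.parent = parent;
        this.resources = resources;
    }

    /**
     * Returns the nearest Resources entry on the path to the root.
     * A present entry, even an empty one, stops the search, so own and
     * inherited resources are never merged.
     */
    Map<String, String> effectiveResources() {
        for (PageTreeNode node = this; node != null; node = node.parent) {
            if (node.resources != null) {
                return node.resources;
            }
        }
        // Assumption of this sketch: fall back to "no resources".
        return Collections.emptyMap();
    }
}
```

A page that omits the entry sees the global resources of its ancestor, while a page with an empty dictionary sees nothing at all.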
[jira] [Commented] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096515#comment-14096515 ] John Hewson commented on PDFBOX-2261: - Yes, PDField's subclasses represent actual fields; PDField should remain abstract. I skipped over that part. PDField is the base class of all terminal fields. Extremely long hang during getFields() on a few PDF files - Key: PDFBOX-2261 URL: https://issues.apache.org/jira/browse/PDFBOX-2261 Project: PDFBox Issue Type: Bug Components: AcroForm Affects Versions: 1.8.6 Reporter: Tim Allison Assignee: Andreas Lehmkühler Priority: Minor Fix For: 2.0.0 Attachments: 966679.pdf, RadioButtons.pdf, screenshot-pdfdebugger.png When I run oap.examples.fdf.PrintFields from trunk, the code seems to hang during acroForm.getFields(). This is a heavy load hang. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14096515#comment-14096515 ] John Hewson edited comment on PDFBOX-2261 at 8/14/14 3:41 AM: -- Yes, a PDField's subclasses represent actual fields, PDField should remain abstract. I skipped over that part. PDField is the base class of all terminal fields. was (Author: jahewson): Yes, a PDField's subclasses represent actual fields, PDField's should remain abstract. I skipped over that part. PDField is the base class of all terminal fields. Extremely long hang during getFields() on a few PDF files - Key: PDFBOX-2261 URL: https://issues.apache.org/jira/browse/PDFBOX-2261 Project: PDFBox Issue Type: Bug Components: AcroForm Affects Versions: 1.8.6 Reporter: Tim Allison Assignee: Andreas Lehmkühler Priority: Minor Fix For: 2.0.0 Attachments: 966679.pdf, RadioButtons.pdf, screenshot-pdfdebugger.png When I run oap.examples.fdf.PrintFields from trunk, the code seems to hang during acroForm.getFields(). This is a heavy load hang. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14096515#comment-14096515 ] John Hewson edited comment on PDFBOX-2261 at 8/14/14 3:42 AM: -- Yes, a PDField's subclasses represent actual fields, PDField should remain abstract. I skipped over that part. PDField is the base class of all terminal fields, i.e. the superclass of all actual fields. was (Author: jahewson): Yes, a PDField's subclasses represent actual fields, PDField should remain abstract. I skipped over that part. PDField is the base class of all terminal fields. Extremely long hang during getFields() on a few PDF files - Key: PDFBOX-2261 URL: https://issues.apache.org/jira/browse/PDFBOX-2261 Project: PDFBox Issue Type: Bug Components: AcroForm Affects Versions: 1.8.6 Reporter: Tim Allison Assignee: Andreas Lehmkühler Priority: Minor Fix For: 2.0.0 Attachments: 966679.pdf, RadioButtons.pdf, screenshot-pdfdebugger.png When I run oap.examples.fdf.PrintFields from trunk, the code seems to hang during acroForm.getFields(). This is a heavy load hang. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14096515#comment-14096515 ] John Hewson edited comment on PDFBOX-2261 at 8/14/14 3:43 AM: -- Yes, a PDField's subclasses represent actual fields, PDField should remain abstract. I skipped over that part. PDField is the superclass of all terminal fields, i.e. actual fields. was (Author: jahewson): Yes, a PDField's subclasses represent actual fields, PDField should remain abstract. I skipped over that part. PDField is the base class of all terminal fields, i.e. the superclass of all actual fields. Extremely long hang during getFields() on a few PDF files - Key: PDFBOX-2261 URL: https://issues.apache.org/jira/browse/PDFBOX-2261 Project: PDFBox Issue Type: Bug Components: AcroForm Affects Versions: 1.8.6 Reporter: Tim Allison Assignee: Andreas Lehmkühler Priority: Minor Fix For: 2.0.0 Attachments: 966679.pdf, RadioButtons.pdf, screenshot-pdfdebugger.png When I run oap.examples.fdf.PrintFields from trunk, the code seems to hang during acroForm.getFields(). This is a heavy load hang. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2261) Extremely long hang during getFields() on a few PDF files
[ https://issues.apache.org/jira/browse/PDFBOX-2261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14096515#comment-14096515 ] John Hewson edited comment on PDFBOX-2261 at 8/14/14 3:44 AM: -- Yes, a PDField's subclasses represent actual fields, PDField should remain abstract. I skipped over that part. PDField is the superclass of all terminal fields, i.e. actual fields. As you say, it will be: {code} abstract class PDField extends PDFieldTreeNode {code} was (Author: jahewson): Yes, a PDField's subclasses represent actual fields, PDField should remain abstract. I skipped over that part. PDField is the superclass of all terminal fields, i.e. actual fields. Extremely long hang during getFields() on a few PDF files - Key: PDFBOX-2261 URL: https://issues.apache.org/jira/browse/PDFBOX-2261 Project: PDFBox Issue Type: Bug Components: AcroForm Affects Versions: 1.8.6 Reporter: Tim Allison Assignee: Andreas Lehmkühler Priority: Minor Fix For: 2.0.0 Attachments: 966679.pdf, RadioButtons.pdf, screenshot-pdfdebugger.png When I run oap.examples.fdf.PrintFields from trunk, the code seems to hang during acroForm.getFields(). This is a heavy load hang. -- This message was sent by Atlassian JIRA (v6.2#6252)
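The hierarchy discussed in this thread can be sketched in plain Java as follows. This is a minimal illustration, not the code that was committed: the class names follow the proposal, but the string map standing in for the underlying COS dictionary, the setEntry/getInheritableEntry method names, and the ExampleTextField subclass are all assumptions made for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

/** Base class of every node in the field tree; holds the inheritance logic. */
abstract class PDFieldTreeNode {
    private final PDNonTerminalField parent;                     // null for a root node
    private final Map<String, String> entries = new HashMap<>(); // stand-in for the dictionary

    protected PDFieldTreeNode() {
        this(null);
    }

    protected PDFieldTreeNode(PDNonTerminalField parent) {
        this.parent = parent;
    }

    public PDNonTerminalField getParent() {
        return parent;
    }

    public void setEntry(String key, String value) {
        entries.put(key, value);
    }

    /** Looks the key up locally first, then walks up towards the root. */
    public String getInheritableEntry(String key) {
        for (PDFieldTreeNode node = this; node != null; node = node.parent) {
            String value = node.entries.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }
}

/** Container for inheritable attributes; has no field type of its own. */
class PDNonTerminalField extends PDFieldTreeNode {
    public PDNonTerminalField() {
        super();
    }

    public PDNonTerminalField(PDNonTerminalField parent) {
        super(parent);
    }
}

/** Superclass of all terminal (actual) fields. */
abstract class PDField extends PDFieldTreeNode {
    protected PDField(PDNonTerminalField parent) {
        super(parent);
    }

    public String getFieldType() {
        return getInheritableEntry("FT");
    }
}

/** Hypothetical concrete terminal field, present only to make the sketch usable. */
class ExampleTextField extends PDField {
    ExampleTextField(PDNonTerminalField parent) {
        super(parent);
    }
}
```

Because a parent must exist before its child is constructed and the parent reference is final, the upward walk in getInheritableEntry always terminates at a root, in contrast to a lookup that recurses down through the kids.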