[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126819#comment-14126819 ]

Andreas Lehmkühler commented on PDFBOX-1511:
--------------------------------------------

Are we done here?

pdfMerger App produces Garbage
------------------------------

Key: PDFBOX-1511
URL: https://issues.apache.org/jira/browse/PDFBOX-1511
Project: PDFBox
Issue Type: Bug
Components: Utilities
Affects Versions: 1.7.1
Environment: Win XP; Windows Server 2008 R2; java version 1.6.0_21,
Reporter: Michael Huber
Fix For: 1.8.7, 2.0.0
Attachments: 078117u1.pdf, 078117u2.pdf, 078118.pdf, 1.pdf, 2.pdf, PDFBox.GlobalResourceMergeTest.Doc01.decoded.pdf, PDFBox.GlobalResourceMergeTest.Doc02.decoded.pdf, PDFBox.GlobalResourceMergeTest.Merged-1.8.6.pdf, PDFBox.GlobalResourceMergeTest.Merged-1.8.7.pdf, PDFMergerUtility.java, PDFMergerUtility.java.diff, PdfRenderer.java, targetPdfMergeJava.pdf, targetPdfMergeUtilityApp.pdf

pdfbox Utility pdfMerger produces a merged document containing garbage. All merged pdf files are contained but Strings are destroyed.
The source pdf files are created with graphviz and are readable without error or disturbance both with Acrobat X and pdfbox pdfDebug Utility.
Another astounding thing is that a hand-coded merger using the pdfMergerUtility class works fine when run within Eclipse Juno and creates the same garbage when run from the command line (please see the attached source PdfRenderer.java).
I checked everything that comes to mind to find the differences, e.g. Java version, encoding/codepage issues, memory settings, and found nothing.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126855#comment-14126855 ]

Tilman Hausherr commented on PDFBOX-1511:
-----------------------------------------

Yes, although we could still add more tests per Maruan's comment on August 16, but these can be added later.
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099573#comment-14099573 ]

Tilman Hausherr commented on PDFBOX-1511:
-----------------------------------------

One can never have too many tests :-) The ones I created for the XImage objects have already prevented me from creating regressions several times.
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099596#comment-14099596 ]

ASF subversion and git services commented on PDFBOX-1511:
---------------------------------------------------------

Commit 1618330 from [~tilman] in branch 'pdfbox/branches/1.8'
[ https://svn.apache.org/r1618330 ]

PDFBOX-1511: Test for PDFMergerUtility, thanks Maruan Sahyoun for the PDF test files
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099604#comment-14099604 ]

ASF subversion and git services commented on PDFBOX-1511:
---------------------------------------------------------

Commit 1618335 from [~tilman] in branch 'pdfbox/branches/1.8'
[ https://svn.apache.org/r1618335 ]

PDFBOX-1511: corrected revision in comment
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099085#comment-14099085 ]

Maruan Sahyoun commented on PDFBOX-1511:
----------------------------------------

There is a small issue with the .Doc02.pdf file: it has a wrong length for a stream, maybe because of my editor. It works anyway. I'll try to fix it so the test files are clean. I thought I'd share it anyway to validate that the issue is fixed.
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14099182#comment-14099182 ]

Tilman Hausherr commented on PDFBOX-1511:
-----------------------------------------

Thanks, I will fix or recreate the 2nd file, then write the test and test the test this weekend. And yes, the merge succeeds with the new version and looks bad with the old version, so it is what we need :-)
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096670#comment-14096670 ]

Tilman Hausherr commented on PDFBOX-1511:
-----------------------------------------

I'll do it tonight.
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097173#comment-14097173 ]

ASF subversion and git services commented on PDFBOX-1511:
---------------------------------------------------------

Commit 1617990 from [~tilman] in branch 'pdfbox/branches/1.8'
[ https://svn.apache.org/r1617990 ]

PDFBOX-1511: don't share resources between different source files, as suggested by Kirk Haines
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097192#comment-14097192 ]

Maruan Sahyoun commented on PDFBOX-1511:
----------------------------------------

[~tilman] Just to ensure that I understand you correctly: you mean the files according to
{quote}
- a file with global resources
- to create an uncompressed copy where we shuffle the names of the resources
- merge it and see what happens.
{quote}
correct?
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14097200#comment-14097200 ]

Tilman Hausherr commented on PDFBOX-1511:
-----------------------------------------

Yes. The resources that we shuffle should be of the same kind, i.e. only fonts, or only images.
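The "shuffle within one kind" idea from the comment above could be sketched roughly as follows. This is only an illustration on plain Java maps, not PDFBox code: the class ResourceNameShuffleSketch, the method rotateNames and the font names are hypothetical, and a fixed rotation by one position stands in for an arbitrary shuffle so that the result is predictable.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of PDFBox: reassigns resource names inside a
// single resource kind (for example fonts only) by rotating them one position.
class ResourceNameShuffleSketch {

    // Input: name -> resource description for ONE kind of resource.
    // Output: the same names, each now pointing at a different resource.
    static Map<String, String> rotateNames(Map<String, String> resourcesOfOneKind) {
        List<String> names = new ArrayList<>(resourcesOfOneKind.keySet());
        List<String> values = new ArrayList<>(resourcesOfOneKind.values());
        Map<String, String> rotated = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            // Name i now refers to the resource that used to follow it.
            rotated.put(names.get(i), values.get((i + 1) % values.size()));
        }
        return rotated;
    }

    public static void main(String[] args) {
        Map<String, String> fonts = new LinkedHashMap<>();
        fonts.put("F1", "Helvetica");
        fonts.put("F2", "Times-Roman");
        fonts.put("F3", "Courier");
        System.out.println(rotateNames(fonts));
    }
}
```

After the rotation, the original and the copy use the same names for different resources, which is the situation a merge test would need to provoke.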
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095353#comment-14095353 ]

Andreas Lehmkühler commented on PDFBOX-1511:
--------------------------------------------

Identically named resources are problematic if 2 or more of the pdfs to be merged are using global resources and if the merger merges the page-related resources and the global resources separately, as it did before the patch.
By using findResources() instead of getResources(), the proposed patch merges the global and the page-specific resources _before_ adding them to the page itself, so that there aren't any duplicated names anymore. I don't know if that was intended in the first place but it solves the problem :-)

OTOH pdfs using global resources will grow after merging as all resources are multiplied. But AFAICT global resources aren't used that often.
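As a rough illustration of the name clash described in the comment above, here is a minimal, self-contained Java sketch. It does not use PDFBox: the Doc class, the method names and the font names are hypothetical stand-ins, and the two merge strategies only model the idea of putting global resources into one shared map versus resolving each page's effective resources first (the role the comment attributes to findResources()).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model, not PDFBox code: a "document" is a global resource map
// plus pages whose own resource map is null when they inherit the global one.
class GlobalResourceMergeSketch {

    static final class Doc {
        final Map<String, String> globalResources = new LinkedHashMap<>();
        final List<Map<String, String>> pages = new ArrayList<>();
    }

    // Own resources if the page has any entry, otherwise the inherited ones.
    static Map<String, String> effectiveResources(Doc doc, Map<String, String> page) {
        return page != null ? page : doc.globalResources;
    }

    // Assumed pre-patch behaviour: global resources of all sources land in one
    // shared map, so a later source replaces an equally named earlier entry.
    static Doc mergeSharingGlobals(List<Doc> sources) {
        Doc target = new Doc();
        for (Doc source : sources) {
            target.globalResources.putAll(source.globalResources);
            for (Map<String, String> page : source.pages) {
                target.pages.add(page == null ? null : new LinkedHashMap<>(page));
            }
        }
        return target;
    }

    // Assumed patched behaviour: resolve each page's effective resources first
    // and store a copy on the page, so nothing is shared between source files.
    static Doc mergePerPage(List<Doc> sources) {
        Doc target = new Doc();
        for (Doc source : sources) {
            for (Map<String, String> page : source.pages) {
                target.pages.add(new LinkedHashMap<>(effectiveResources(source, page)));
            }
        }
        return target;
    }

    // Builds a one-page document whose only font "F1" is a global resource.
    static Doc docWithGlobalFont(String fontName) {
        Doc doc = new Doc();
        doc.globalResources.put("F1", fontName);
        doc.pages.add(null); // the page inherits the global resources
        return doc;
    }

    public static void main(String[] args) {
        List<Doc> sources = Arrays.asList(docWithGlobalFont("Helvetica"), docWithGlobalFont("Symbol"));
        Doc shared = mergeSharingGlobals(sources);
        Doc perPage = mergePerPage(sources);
        // First page of the first source: which font does "F1" resolve to?
        System.out.println(effectiveResources(shared, shared.pages.get(0)).get("F1"));
        System.out.println(effectiveResources(perPage, perPage.pages.get(0)).get("F1"));
    }
}
```

In this model the shared-map merge prints Symbol for a page that was written with Helvetica, while the per-page merge prints Helvetica. The per-page variant also copies the resources once per page, which matches the file growth mentioned in the comment.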
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095996#comment-14095996 ]

Tilman Hausherr commented on PDFBOX-1511:
-----------------------------------------

To verify this we need
- a file with global resources
- to create an uncompressed copy where we shuffle the names of the resources
- merge it and see what happens.

I did try it and no mayhem followed; there were no longer global resources in the merged file. However, I can't share the file (PDFBOX-2048), and I need one that I can attach here so that Michael and Kirk can also have a look. Now that GSoC2014 is done and the weather is less warm, I'll run my tests on the digitalcorpora site until I hit a file with global resources.

{code}
// Assumes "document" is an already loaded PDDocument.
PDResources globalRes = document.getDocumentCatalog().getPages().getResources();
if (globalRes != null)
{
    System.out.println("global resources size: " + globalRes.getXObjects().size());
    for (String key : globalRes.getXObjects().keySet())
    {
        System.out.println("global resource: " + key);
    }
}
else
{
    System.out.println("no global resources");
}
{code}
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14096208#comment-14096208 ]

Maruan Sahyoun commented on PDFBOX-1511:
----------------------------------------

If I understand the spec correctly
{quote}
Resources (Required; inheritable) A dictionary containing any resources required by the page (see 7.8.3, Resource Dictionaries). If the page requires no resources, the value of this entry shall be an empty dictionary. Omitting the entry entirely indicates that the resources shall be inherited from an ancestor node in the page tree.
{quote}
a specific page either has its own resources, uses ancestor resources, or has none, but there is no mix.
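The lookup rule read out of the spec quote above can be sketched as a walk up the page tree. This is a minimal model on plain Java objects, not PDFBox code: the Node class and the resolve method are hypothetical, a null map stands for an omitted Resources entry, and returning an empty map when no ancestor has an entry is a choice made for this sketch only.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical page-tree model: each node may or may not carry a Resources entry.
class InheritedResourcesSketch {

    static final class Node {
        final Node parent;                    // null for the root of the page tree
        final Map<String, String> resources;  // null means "entry omitted"

        Node(Node parent, Map<String, String> resources) {
            this.parent = parent;
            this.resources = resources;
        }
    }

    // Returns the first Resources entry found on the node or its ancestors.
    // An entry that is present but empty still wins: there is no mixing.
    static Map<String, String> resolve(Node node) {
        for (Node current = node; current != null; current = current.parent) {
            if (current.resources != null) {
                return current.resources;
            }
        }
        return Collections.emptyMap(); // sketch-only fallback, no entry anywhere
    }

    public static void main(String[] args) {
        Map<String, String> global = new LinkedHashMap<>();
        global.put("F1", "Helvetica");
        global.put("Im1", "logo");
        Node root = new Node(null, global);

        Map<String, String> own = new LinkedHashMap<>();
        own.put("F1", "Courier");

        Node inheriting = new Node(root, null);
        Node withOwn = new Node(root, own);
        Node withEmpty = new Node(root, new LinkedHashMap<String, String>());

        System.out.println(resolve(inheriting)); // inherited from the root
        System.out.println(resolve(withOwn));    // own entry only, "Im1" is not visible
        System.out.println(resolve(withEmpty));  // empty entry, nothing inherited
    }
}
```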
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072875#comment-14072875 ]

Maruan Sahyoun commented on PDFBOX-1511:
----------------------------------------

[~tilman] I could test a modification against a set of files we are using for a customer where we are merging banking documents. I'm not using PDFMergerUtility though.
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072902#comment-14072902 ]

Tilman Hausherr commented on PDFBOX-1511:
-----------------------------------------

Ok, committed for the trunk only as rev 1613017. NOT done for the 1.8 branch yet.
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072923#comment-14072923 ]

Tilman Hausherr commented on PDFBOX-1511:
-----------------------------------------

Michael: even if you're not using the trunk version, you can try the command line app by downloading the latest snapshot here:
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.0-SNAPSHOT/
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073076#comment-14073076 ]

Maruan Sahyoun commented on PDFBOX-1511:
----------------------------------------

[~tilman] I did some testing using the latest pdfbox-app snapshot as well as an older one prior to the patch and pdfbox-app-1.8.6. Works fine and improves the file size. OTOH, with the test files I have, all results were OK, i.e. Adobe Reader displays them correctly and there are no issues reported by Acrobat's Preflight tool.
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073371#comment-14073371 ]

Tilman Hausherr commented on PDFBOX-1511:
-----------------------------------------

Thanks... what we'd also need is a test that fails with the old version and succeeds with the new one:
- two files, either created by somebody who has filed a CLA, or with no copyright (e.g. government files)
- a java test, i.e. not just a human looking at it but some java call that can find whether two identically named resources are different.
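The kind of automated check asked for above, finding identically named resources that differ, might look roughly like this. It is only a sketch under assumptions: resources are modeled as name-to-bytes maps rather than PDFBox objects, and the class ResourceCollisionCheckSketch and the method conflictingNames are hypothetical names, not an existing API or the test that was later committed.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical check: which resource names exist in both maps but refer to
// different content? Content is modeled as the raw bytes of the resource.
class ResourceCollisionCheckSketch {

    static Set<String> conflictingNames(Map<String, byte[]> first, Map<String, byte[]> second) {
        Set<String> conflicts = new TreeSet<>();
        for (Map.Entry<String, byte[]> entry : first.entrySet()) {
            byte[] other = second.get(entry.getKey());
            // Same name on both sides, but the bytes are not identical.
            if (other != null && !Arrays.equals(entry.getValue(), other)) {
                conflicts.add(entry.getKey());
            }
        }
        return conflicts;
    }

    // Convenience for the example: turns a description into bytes.
    static byte[] bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, byte[]> doc1 = new LinkedHashMap<>();
        doc1.put("F1", bytes("Helvetica"));
        doc1.put("F2", bytes("Courier"));

        Map<String, byte[]> doc2 = new LinkedHashMap<>();
        doc2.put("F1", bytes("Symbol"));  // same name, different font
        doc2.put("F2", bytes("Courier")); // same name, same font
        doc2.put("F3", bytes("Times-Roman"));

        System.out.println(conflictingNames(doc1, doc2));
    }
}
```

A test built on such a check could assert that the set of conflicting names is non-empty for the two source files and that every merged page still resolves each name to the bytes it had in its own source.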
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14073376#comment-14073376 ] Maruan Sahyoun commented on PDFBOX-1511: I can create such files but that would need to wait as I’m offline the next two weeks. If someone beats me to it fine, otherwise I’ll take care of it upon my return.
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14073512#comment-14073512 ] Tilman Hausherr commented on PDFBOX-1511: Thanks... I have set the fix version to 1.8.7 so if we forget (there are currently about 20 new issues per week!), Andreas will catch it before releasing the next version.
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14072175#comment-14072175 ] Tilman Hausherr commented on PDFBOX-1511: Although you made a diff against the wrong version, I think I see what would have to be changed. The previous strategy is indeed risky, and I wonder why there haven't been any complaints except here. Many PDF files name their images Im1, Im2, etc. Is there anybody here who does a lot of merging and could test a modification?
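To illustrate why common names such as Im1 make a name-based merge risky, here is a toy sketch. It assumes the risky strategy is one where resources from all source documents are pooled by name and the first definition wins; the thread does not spell out the exact strategy, so that is an assumption. The class `ResourceMergeSketch` is hypothetical and models resources as plain strings, not PDFBox objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model, not PDFBox code: resource name -> description of its content. */
final class ResourceMergeSketch {

    /** Risky variant: a name already present in the target wins, the incoming entry is dropped. */
    static Map<String, String> mergeKeepFirst(Map<String, String> target, Map<String, String> incoming) {
        Map<String, String> merged = new LinkedHashMap<>(target);
        for (Map.Entry<String, String> entry : incoming.entrySet()) {
            if (!merged.containsKey(entry.getKey())) {
                merged.put(entry.getKey(), entry.getValue());
            }
        }
        return merged;
    }

    /** Safer variant: an incoming entry that clashes by name but differs in content gets a fresh name. */
    static Map<String, String> mergeWithRename(Map<String, String> target, Map<String, String> incoming) {
        Map<String, String> merged = new LinkedHashMap<>(target);
        for (Map.Entry<String, String> entry : incoming.entrySet()) {
            String name = entry.getKey();
            String existing = merged.get(name);
            if (existing != null && !existing.equals(entry.getValue())) {
                int suffix = 1;
                while (merged.containsKey(name + "_" + suffix)) {
                    suffix++;
                }
                name = name + "_" + suffix;
            }
            merged.put(name, entry.getValue());
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> first = new LinkedHashMap<>();
        first.put("Im1", "logo");
        Map<String, String> second = new LinkedHashMap<>();
        second.put("Im1", "chart");

        System.out.println(mergeKeepFirst(first, second));  // the chart is lost
        System.out.println(mergeWithRename(first, second)); // both images survive
    }
}
```

In a real merge, renaming a resource is only half of the work: the page content that refers to the old name would have to be updated as well, or each page would need to keep its own resource dictionary.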
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14064170#comment-14064170 ] John Hewson commented on PDFBOX-1511: - Hi Kirk, can you attach a patch using svn diff please.
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721174#comment-13721174 ] Kirk Haines commented on PDFBOX-1511: I have also experienced this (Windows 7, Java 1.6.0_35-b10 64-bit) in PDFBox 1.7.1 through the current trunk. I tried Maruan's suggestion and it resolved the issue, at the expense of creating unnecessary duplicate resources.
I had noticed that the corruption in subsequent documents resulted in those pages having their formatting preserved, but the text content had many letters substituted (all 'd' replaced by 'f', all 'y' replaced by 'd', etc.). I also found that the degree of corruption depended on how similar the beginning text content of each input document was. When there was a common header in the documents being merged, there were only a few substitutions. When it was merging a document with itself, there were no errors. When the document header was very different, the resulting text was undecipherable garbage.
This made me suspect that it may be a problem with the deflate compression being applied to the stream. I thought that it might be using the (compression) dictionary from the first document and copying the physical bytes from the source document, rather than reading the logical bytes and allowing the deflate filter in the context of the destination document to re-encode them.
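The symptoms described in this comment (consistent one-to-one letter substitutions, fewer errors with a shared header, none when a document is merged with itself) can be reproduced in a toy model. The sketch below is an illustration under stated assumptions, not a diagnosis confirmed in this message: it assumes each document embeds a subset font that numbers its glyphs in order of first appearance, and that the merged pages of the second document end up drawn with the first document's font of the same name. The class `SubsetFontSketch` is hypothetical and uses no PDFBox code.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of subset fonts: a font is the list of characters, indexed by glyph code. */
final class SubsetFontSketch {

    /** Assigns glyph codes in order of first appearance in the text (an assumption of this sketch). */
    static List<Character> buildSubset(String text) {
        List<Character> glyphs = new ArrayList<>();
        for (char c : text.toCharArray()) {
            if (!glyphs.contains(c)) {
                glyphs.add(c);
            }
        }
        return glyphs;
    }

    /** Turns text into glyph codes using the given font; every character must be in the font. */
    static int[] encode(String text, List<Character> font) {
        int[] codes = new int[text.length()];
        for (int i = 0; i < text.length(); i++) {
            codes[i] = font.indexOf(text.charAt(i));
        }
        return codes;
    }

    /** Turns glyph codes back into text; codes the font does not contain become '?'. */
    static String decode(int[] codes, List<Character> font) {
        StringBuilder out = new StringBuilder();
        for (int code : codes) {
            out.append(code >= 0 && code < font.size() ? font.get(code) : '?');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String first = "hello world";
        String second = "hello there";
        List<Character> fontOfFirst = buildSubset(first);
        List<Character> fontOfSecond = buildSubset(second);

        // The second document's codes, drawn with the first document's font.
        System.out.println(decode(encode(second, fontOfSecond), fontOfFirst));
    }
}
```

In this model, "hello there" comes out as "hello where": the shared prefix survives because both subsets assign the same codes to it, and only the letters that first appear after the texts diverge are substituted.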
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13573405#comment-13573405 ] Maruan Sahyoun commented on PDFBOX-1511: As a quick hack you can add the line
    newPage.setResources(new PDResources((COSDictionary) cloner.cloneForNewDocument(srcResources)));
in the appendDocument method of PDFMergerUtility, around line 410, after
    newPage.setRotation( page.findRotation() );
As this creates a new resource for each page, the resulting PDF file will be larger than necessary.
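The trade-off of this quick hack can be seen in a toy model: when every appended page gets its own copy of its source document's resources, no page can pick up another document's resource by name, but the number of stored entries grows with the page count. The sketch below uses plain maps rather than PDFBox objects; the class `PerPageResourcesSketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model, not PDFBox code: a page is represented only by its resource map. */
final class PerPageResourcesSketch {

    /** Gives each of the pageCount appended pages its own copy of the source resources. */
    static List<Map<String, String>> appendWithPerPageCopies(Map<String, String> srcResources, int pageCount) {
        List<Map<String, String>> pages = new ArrayList<>();
        for (int i = 0; i < pageCount; i++) {
            pages.add(new HashMap<>(srcResources)); // independent copy per page
        }
        return pages;
    }

    /** Counts resource entries stored across all pages, as a rough stand-in for file size. */
    static int totalEntries(List<Map<String, String>> pages) {
        int total = 0;
        for (Map<String, String> page : pages) {
            total += page.size();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, String> src = new HashMap<>();
        src.put("F1", "font");
        src.put("Im1", "image");

        List<Map<String, String>> pages = appendWithPerPageCopies(src, 3);
        System.out.println(totalEntries(pages) + " entries for " + pages.size() + " pages");
    }
}
```

With a shared dictionary the same three pages would need only the two entries once, which is the size overhead the comment refers to.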
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13573439#comment-13573439 ] Michael Huber commented on PDFBOX-1511: Yep, did it on cmd line. Consequently, it now fails when run from Eclipse: the first page is rendered, and all following pages are inserted with the correct size but are blank. Thanks for saving my project time! Kind regards, Michael