[jira] [Created] (PDFBOX-1511) pdfMerger App as well hand coded Java Merger
Michael Huber created PDFBOX-1511:

Summary: pdfMerger App as well hand coded Java Merger
Key: PDFBOX-1511
URL: https://issues.apache.org/jira/browse/PDFBOX-1511
Project: PDFBox
Issue Type: Bug
Components: Utilities
Affects Versions: 1.7.1
Environment: Win XP; Windows Server 2008 R2; java version 1.6.0_21
Reporter: Michael Huber
Fix For: 1.8.0
Attachments: 1.pdf, 2.pdf

The PDFBox utility pdfMerger produces a merged document containing garbage. All merged PDF files are present, but strings are destroyed. The source PDF files were created with Graphviz and are readable without errors in both Acrobat X and the PDFBox pdfDebug utility.

Another astounding thing is that a hand-coded merger using the PDFMergerUtility class works fine when run within Eclipse Juno but creates the same garbage when run from the command line (please see the attached source).

I checked everything that came to mind to find the differences, e.g. Java version, encoding/codepage issues, memory settings, and found nothing.

--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PDFBOX-1511) pdfMerger App as well hand coded Java Merger
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Huber updated PDFBOX-1511:

Attachment: 2.pdf, 1.pdf

Source PDF files.
[jira] [Updated] (PDFBOX-1511) pdfMerger App as well hand coded Java Merger
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Huber updated PDFBOX-1511:

Attachment: targetPdfMergeUtilityApp.pdf, targetPdfMergeJava.pdf

The merged files.
[jira] [Updated] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Huber updated PDFBOX-1511:

Summary: pdfMerger App produces Garbage (was: pdfMerger App as well hand coded Java Merger)
Attachments: 1.pdf, 2.pdf, PdfRenderer.java, targetPdfMergeJava.pdf, targetPdfMergeUtilityApp.pdf
[jira] [Created] (PDFBOX-1512) TextPositionComparator is not compatible with Java 7
Benjamin Papez created PDFBOX-1512:

Summary: TextPositionComparator is not compatible with Java 7
Key: PDFBOX-1512
URL: https://issues.apache.org/jira/browse/PDFBOX-1512
Project: PDFBox
Issue Type: Bug
Components: Text extraction
Affects Versions: 1.7.1
Environment: Java 7
Reporter: Benjamin Papez

The TextPositionComparator causes the following exception when running on Java 7:

    Unexpected RuntimeException from org.apache.tika.parser.ParserDecorator$1@9007fa2
    Original cause: Comparison method violates its general contract!

I think the problem is with this check:

    if ( yDifference < .1 ||
         (pos2YBottom >= pos1YTop && pos2YBottom <= pos1YBottom) ||
         (pos1YBottom >= pos2YTop && pos1YBottom <= pos2YBottom))

as it violates the contract requirement:

    The implementor must also ensure that the relation is transitive: ((compare(x, y)>0) && (compare(y, z)>0)) implies compare(x, z)>0.
    Finally, the implementor must ensure that compare(x, y)==0 implies that sgn(compare(x, z))==sgn(compare(y, z)) for all z.

Java 7 is now strict and throws an exception when the contract is violated.
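To illustrate the kind of violation described above, here is a minimal sketch. It is not the PDFBox code: it assumes a simplified, one-dimensional stand-in in which two Y values closer than 0.1 compare as equal, and the class and field names (ToleranceComparatorDemo, TOLERANT) are hypothetical. It shows how a tolerance-based "same line" test can break the rule that compare(x, y)==0 implies sgn(compare(x, z))==sgn(compare(y, z)).

```java
import java.util.Comparator;

class ToleranceComparatorDemo {
    // Hypothetical stand-in for the "same line" check: Y values closer than
    // 0.1 are treated as equal, otherwise they are ordered numerically.
    public static final Comparator<Double> TOLERANT = (a, b) ->
            Math.abs(a - b) < 0.1 ? 0 : Double.compare(a, b);

    // Returns true when the triple (x, y, z) breaks the rule
    // compare(x, y) == 0  =>  sgn(compare(x, z)) == sgn(compare(y, z)).
    public static boolean violatesContract(double x, double y, double z) {
        if (TOLERANT.compare(x, y) != 0) {
            return false; // the rule only applies when x and y compare equal
        }
        int xz = Integer.signum(TOLERANT.compare(x, z));
        int yz = Integer.signum(TOLERANT.compare(y, z));
        return xz != yz;
    }

    public static void main(String[] args) {
        double x = 0.0, y = 0.08, z = 0.16;
        // x and y are "equal", y and z are "equal", yet x sorts before z.
        System.out.println("compare(x, y) = " + TOLERANT.compare(x, y));
        System.out.println("compare(y, z) = " + TOLERANT.compare(y, z));
        System.out.println("compare(x, z) = " + TOLERANT.compare(x, z));
        System.out.println("violates contract: " + violatesContract(x, y, z));
    }
}
```

With inputs like these, a sort implementation that checks for consistency may report "Comparison method violates its general contract!", depending on the data and the list size.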
[jira] [Updated] (PDFBOX-1512) TextPositionComparator is not compatible with Java 7
[ https://issues.apache.org/jira/browse/PDFBOX-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Papez updated PDFBOX-1512:

Attachment: TextPositionComparator.java

I modified the implementation; with the attached file, the exception no longer occurred.
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573405#comment-13573405 ]

Maruan Sahyoun commented on PDFBOX-1511:

As a quick hack, you can add the line

    newPage.setResources(new PDResources((COSDictionary) cloner.cloneForNewDocument(srcResources)));

in the appendDocument method of PDFMergerUtility, around line 410, after

    newPage.setRotation( page.findRotation() );

As this creates a new PDResources object for each page, the resulting PDF file will be larger than necessary.
[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573439#comment-13573439 ]

Michael Huber commented on PDFBOX-1511:

Yep, did it on cmd line. Consequently now it fails when run from Eclipse, the first page is rendered, all following page are inserted with correct size but are blank.

Thanks for saving my project time!

Kind regards,
Michael
[jira] [Comment Edited] (PDFBOX-1511) pdfMerger App produces Garbage
[ https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573439#comment-13573439 ]

Michael Huber edited comment on PDFBOX-1511 at 2/7/13 12:53 PM:

Yep, did it on cmd line. Consequently now it fails when run from Eclipse, the first page is rendered, all following pages are inserted with correct size but are blank.

Thanks for saving my project time!

Kind regards,
Michael

was (Author: pdfbox_mike):

Yep, did it on cmd line. Consequently now it fails when run from Eclipse, the first page is rendered, all following page are inserted with correct size but are blank.

Thanks for saving my project time!

Kind regards,
Michael