[jira] [Created] (PDFBOX-1511) pdfMerger App as well hand coded Java Merger

2013-02-07 Thread Michael Huber (JIRA)
Michael Huber created PDFBOX-1511:
-

 Summary: pdfMerger App as well hand coded Java Merger 
 Key: PDFBOX-1511
 URL: https://issues.apache.org/jira/browse/PDFBOX-1511
 Project: PDFBox
  Issue Type: Bug
  Components: Utilities
Affects Versions: 1.7.1
 Environment: Win XP; Windows Server 2008 R2; java version 1.6.0_21, 
Reporter: Michael Huber
 Fix For: 1.8.0
 Attachments: 1.pdf, 2.pdf

The PDFBox pdfMerger utility produces a merged document containing garbage. All 
pages of the source PDF files are present, but the text strings are corrupted.

The source PDF files were created with Graphviz and can be read without errors 
or rendering problems both in Acrobat X and in the PDFBox pdfDebug utility.

Another astounding thing is that a hand-coded merger using the PDFMergerUtility 
class works fine when run within Eclipse Juno but creates the same garbage when 
run from the command line (please see the attached source).

I checked everything that came to mind to find the difference, e.g. Java 
version, encoding/codepage issues, and memory settings, but found nothing.
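
One way to compare the two launch contexts is to print the relevant JVM 
settings from each and diff the output. The following is only a minimal sketch: 
the class name EnvReport is hypothetical and is not part of PDFBox or of the 
attached source. It reports the settings mentioned above plus the classpath, on 
the assumption that a classpath difference is also worth ruling out.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox or the attached source.
class EnvReport {

    // Collects JVM settings that can differ between an IDE launch
    // and a plain command-line launch.
    static Map<String, String> collect() {
        Map<String, String> report = new LinkedHashMap<String, String>();
        report.put("java.version", System.getProperty("java.version"));
        report.put("java.home", System.getProperty("java.home"));
        report.put("file.encoding", System.getProperty("file.encoding"));
        report.put("default.charset", Charset.defaultCharset().name());
        report.put("max.memory.bytes",
                Long.toString(Runtime.getRuntime().maxMemory()));
        report.put("java.class.path", System.getProperty("java.class.path"));
        return report;
    }

    public static void main(String[] args) {
        // Print one "key = value" line per setting so two runs can be diffed.
        for (Map.Entry<String, String> e : collect().entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Running it once from Eclipse and once from the command line, then comparing the 
two outputs line by line, shows whether any of these settings differ.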

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (PDFBOX-1511) pdfMerger App as well hand coded Java Merger

2013-02-07 Thread Michael Huber (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Huber updated PDFBOX-1511:
--

Attachment: 2.pdf
1.pdf

Source pdf files



[jira] [Updated] (PDFBOX-1511) pdfMerger App as well hand coded Java Merger

2013-02-07 Thread Michael Huber (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Huber updated PDFBOX-1511:
--

Attachment: targetPdfMergeUtilityApp.pdf
targetPdfMergeJava.pdf

The merged files



[jira] [Updated] (PDFBOX-1511) pdfMerger App produces Garbage

2013-02-07 Thread Michael Huber (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Huber updated PDFBOX-1511:
--

Summary: pdfMerger App produces Garbage  (was: pdfMerger App as well hand 
coded Java Merger )

 pdfMerger App produces Garbage
 --

 Key: PDFBOX-1511
 URL: https://issues.apache.org/jira/browse/PDFBOX-1511
 Project: PDFBox
  Issue Type: Bug
  Components: Utilities
Affects Versions: 1.7.1
 Environment: Win XP; Windows Server 2008 R2; java version 1.6.0_21, 
Reporter: Michael Huber
 Fix For: 1.8.0

 Attachments: 1.pdf, 2.pdf, PdfRenderer.java, targetPdfMergeJava.pdf, 
 targetPdfMergeUtilityApp.pdf




[jira] [Created] (PDFBOX-1512) TextPositionComparator is not compatible with Java 7

2013-02-07 Thread Benjamin Papez (JIRA)
Benjamin Papez created PDFBOX-1512:
--

 Summary: TextPositionComparator is not compatible with Java 7
 Key: PDFBOX-1512
 URL: https://issues.apache.org/jira/browse/PDFBOX-1512
 Project: PDFBox
  Issue Type: Bug
  Components: Text extraction
Affects Versions: 1.7.1
 Environment: Java 7
Reporter: Benjamin Papez


The TextPositionComparator causes the following exception when running on 
Java 7: Unexpected RuntimeException from 
org.apache.tika.parser.ParserDecorator$1@9007fa2 Original cause: Comparison 
method violates its general contract!

I think the problem is with this check:

if ( yDifference < .1 ||
(pos2YBottom >= pos1YTop && pos2YBottom <= pos1YBottom) ||
(pos1YBottom >= pos2YTop && pos1YBottom <= pos2YBottom))

as it violates the contract requirement:

The implementor must also ensure that the relation is transitive: ((compare(x, 
y)>0) && (compare(y, z)>0)) implies compare(x, z)>0.

Finally, the implementor must ensure that compare(x, y)==0 implies that 
sgn(compare(x, z))==sgn(compare(y, z)) for all z.

Java 7 is now stricter: its sort implementation can throw an 
IllegalArgumentException when it detects that the contract is violated.
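
To illustrate why a tolerance check of this kind can break the contract, here 
is a minimal sketch. It is a simplified stand-in, not the PDFBox code: the class 
ToleranceDemo and the sample coordinates are made up for the example, and only 
the "difference below 0.1 counts as equal" idea is taken from the check quoted 
above.

```java
import java.util.Comparator;

// Simplified stand-in for a tolerance-based comparison; NOT the PDFBox code.
class ToleranceDemo {

    // Treats two y-coordinates as equal when they differ by less than 0.1.
    static final Comparator<Double> BY_LINE = new Comparator<Double>() {
        public int compare(Double a, Double b) {
            if (Math.abs(a - b) < 0.1) {
                return 0;
            }
            return a < b ? -1 : 1;
        }
    };

    // True when compare(x, y) == 0 but x and y order differently against z,
    // which breaks the sgn(compare(x, z)) == sgn(compare(y, z)) requirement.
    static boolean violatesContract(double x, double y, double z) {
        return BY_LINE.compare(x, y) == 0
                && Integer.signum(BY_LINE.compare(x, z))
                   != Integer.signum(BY_LINE.compare(y, z));
    }

    public static void main(String[] args) {
        // Neighbours are within the tolerance, the outer pair is not.
        double x = 0.0, y = 0.0625, z = 0.125;
        System.out.println("compare(x, y) = " + BY_LINE.compare(x, y));
        System.out.println("compare(y, z) = " + BY_LINE.compare(y, z));
        System.out.println("compare(x, z) = " + BY_LINE.compare(x, z));
        System.out.println("violates contract: " + violatesContract(x, y, z));
    }
}
```

With these values x equals y and y equals z under the tolerance, yet x sorts 
before z, so "equal" is not a consistent relation and the last clause of the 
contract fails.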



[jira] [Updated] (PDFBOX-1512) TextPositionComparator is not compatible with Java 7

2013-02-07 Thread Benjamin Papez (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-1512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Papez updated PDFBOX-1512:
---

Attachment: TextPositionComparator.java

I modified the implementation and tested it with the attached file; the 
exception no longer occurs.



[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage

2013-02-07 Thread Maruan Sahyoun (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573405#comment-13573405
 ] 

Maruan Sahyoun commented on PDFBOX-1511:


As a quick hack, you can add the line

newPage.setResources(new PDResources((COSDictionary) 
cloner.cloneForNewDocument(srcResources)));

in the appendDocument method of PDFMergerUtility, around line 410, after 
newPage.setRotation( page.findRotation() ); Because this creates a new resource 
for each page, the resulting PDF file will be larger than necessary.



[jira] [Commented] (PDFBOX-1511) pdfMerger App produces Garbage

2013-02-07 Thread Michael Huber (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573439#comment-13573439
 ] 

Michael Huber commented on PDFBOX-1511:
---

Yep, that did it on the command line.

Conversely, it now fails when run from Eclipse: the first page is rendered, but 
all following pages are inserted with the correct size and are blank.

Thanks for saving my project time!

Kind regards,

Michael



[jira] [Comment Edited] (PDFBOX-1511) pdfMerger App produces Garbage

2013-02-07 Thread Michael Huber (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13573439#comment-13573439
 ] 

Michael Huber edited comment on PDFBOX-1511 at 2/7/13 12:53 PM:


Yep, that did it on the command line.

Conversely, it now fails when run from Eclipse: the first page is rendered, but 
all following pages are inserted with the correct size and are blank.

Thanks for saving my project time!

Kind regards,

Michael

  