Re: How to proceed?
Hi,

I added the information about patching a branch to the developer section of the CMS site. Maybe it's helpful for other people too :-)

Maruan Sahyoun

On 03.05.2013 at 10:50, Thomas Chojecki i...@rayman2200.de wrote:

On 01.05.2013 at 19:56, Andreas Lehmkuehler wrote:

Hi,

On 01.05.2013 at 11:55, Thomas Chojecki wrote:

It seems that I am missing some basic knowledge about maintaining branches. I have only used branches for bug fix releases without merging, so I would just commit the change to the branch and to the trunk as separate commits, without any merge attempt. If this is the wrong way, how do I merge the changes from 2.0.0 into the 1.8.x branch (like the Oracle JVM error)? Also, I thought that the trunk and the branch would never be merged in the end, so why do it this way?

Let's have a look at the 1.8.1 release. When I started the preparation, the branch contained the 1.8.0 source. I did some cherry picking and chose some of the fixes which were done in the trunk. I merged those changes to the 1.8 branch. Based on that code I created the 1.8.1 release.

I'm a little bit confused. Maybe we are talking about the very same thing. :-) I'm using something like this:

- check out the branch
- cd to the branch directory
- merge some changes from the trunk using
  svn merge -cREV1,REV2,REV3... https://svn.apache.org/repos/asf/pdfbox/trunk
- commit the changes

Thanks for that explanation. I didn't know about cherry picking in svn, and it makes sense :-) This example helps a lot. So for me it makes no difference whether we merge from the branch to the trunk or vice versa. But if we mainly work on the trunk, it makes sense to merge from the trunk to the branch.

A 1.9 branch would only be needed if we really want to release a new version including improvements based on the current API. Are you planning to do something like that? I'd like to concentrate on 2.0 and limit the support for the old version to bugfixes and maybe smaller enhancements.

Hmm, to be honest, I would do that extra work if applying a patch to the branch were not too complicated.

I was just wondering if we really need another branch. All bugfixes should go to the 1.8 branch as long as we don't want to publish a new feature release.

I was just using the wrong wording. I meant maybe setting the branch version to 1.9.0-SNAPSHOT. But this is just nice to have and not really necessary.

BR
Andreas Lehmkühler

Best regards
Thomas
Build failed in Jenkins: PDFBox-trunk » Apache PDFBox webapp #646
See https://builds.apache.org/job/PDFBox-trunk/org.apache.pdfbox$pdfbox-war/646/
--
[INFO]
[INFO]
[INFO] Building Apache PDFBox webapp 2.0.0-SNAPSHOT
[INFO]
[INFO]
[INFO] --- maven-clean-plugin:2.4.1:clean (default-clean) @ pdfbox-war ---
[INFO] Deleting https://builds.apache.org/job/PDFBox-trunk/org.apache.pdfbox$pdfbox-war/ws/target
[INFO]
[INFO] --- maven-remote-resources-plugin:1.2.1:process (default) @ pdfbox-war ---
[INFO]
[INFO] --- maven-resources-plugin:2.5:resources (default-resources) @ pdfbox-war ---
[debug] execute contextualize
[INFO] Using 'ISO-8859-1' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory https://builds.apache.org/job/PDFBox-trunk/org.apache.pdfbox$pdfbox-war/ws/src/main/resources
[INFO] Copying 3 resources
[INFO]
[INFO] No sources to compile
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ pdfbox-war ---
[INFO]
[debug] execute contextualize
[INFO] Using 'ISO-8859-1' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory https://builds.apache.org/job/PDFBox-trunk/org.apache.pdfbox$pdfbox-war/ws/src/test/resources
[INFO] Copying 3 resources
[INFO] --- maven-resources-plugin:2.5:testResources (default-testResources) @ pdfbox-war ---
[INFO]
[INFO] --- maven-compiler-plugin:2.3.2:testCompile (default-testCompile) @ pdfbox-war ---
[INFO] No sources to compile
[INFO]
[INFO] --- maven-surefire-plugin:2.9:test (default-test) @ pdfbox-war ---
[INFO] Surefire report directory: https://builds.apache.org/job/PDFBox-trunk/org.apache.pdfbox$pdfbox-war/ws/target/surefire-reports
--- T E S T S ---
Results :
Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
[JENKINS] Recording test results
Build failed in Jenkins: PDFBox-trunk #646
See https://builds.apache.org/job/PDFBox-trunk/646/
--
[...truncated 750 lines...]
[INFO] Building jar: https://builds.apache.org/job/PDFBox-trunk/ws/trunk/lucene/target/pdfbox-lucene-2.0.0-SNAPSHOT.jar
[INFO]
[INFO] --- maven-site-plugin:3.0:attach-descriptor (attach-descriptor) @ pdfbox-lucene ---
[INFO]
[INFO] Exclude: release.properties
[INFO] --- apache-rat-plugin:0.6:check (default) @ pdfbox-lucene ---
[INFO]
[INFO] --- maven-install-plugin:2.3.1:install (default-install) @ pdfbox-lucene ---
[INFO] Installing https://builds.apache.org/job/PDFBox-trunk/ws/trunk/lucene/target/pdfbox-lucene-2.0.0-SNAPSHOT.jar to /export/home/hudson/hudson-slave/maven-repositories/0/org/apache/pdfbox/pdfbox-lucene/2.0.0-SNAPSHOT/pdfbox-lucene-2.0.0-SNAPSHOT.jar
[INFO] Installing https://builds.apache.org/job/PDFBox-trunk/ws/trunk/lucene/pom.xml to /export/home/hudson/hudson-slave/maven-repositories/0/org/apache/pdfbox/pdfbox-lucene/2.0.0-SNAPSHOT/pdfbox-lucene-2.0.0-SNAPSHOT.pom
[INFO]
[INFO] --- maven-deploy-plugin:2.6:deploy (default-deploy) @ pdfbox-lucene ---
Downloading: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-lucene/2.0.0-SNAPSHOT/maven-metadata.xml
Downloaded: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-lucene/2.0.0-SNAPSHOT/maven-metadata.xml (780 B at 0.7 KB/sec)
Uploading: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-lucene/2.0.0-SNAPSHOT/pdfbox-lucene-2.0.0-20130504.100020-4.jar
Uploaded: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-lucene/2.0.0-SNAPSHOT/pdfbox-lucene-2.0.0-20130504.100020-4.jar (17 KB at 10.3 KB/sec)
Uploading: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-lucene/2.0.0-SNAPSHOT/pdfbox-lucene-2.0.0-20130504.100020-4.pom
Uploaded: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-lucene/2.0.0-SNAPSHOT/pdfbox-lucene-2.0.0-20130504.100020-4.pom (2 KB at 1.1 KB/sec)
Downloading: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-lucene/maven-metadata.xml
Downloaded: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-lucene/maven-metadata.xml (470 B at 0.5 KB/sec)
Uploading: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-lucene/2.0.0-SNAPSHOT/maven-metadata.xml
Uploaded: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-lucene/2.0.0-SNAPSHOT/maven-metadata.xml (780 B at 0.5 KB/sec)
Uploading: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-lucene/maven-metadata.xml
Uploaded: https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/pdfbox-lucene/maven-metadata.xml (470 B at 0.2 KB/sec)
[INFO]
[INFO]
[INFO] Building Apache PDFBox for Ant 2.0.0-SNAPSHOT
[INFO]
[INFO]
[INFO] Deleting https://builds.apache.org/job/PDFBox-trunk/ws/trunk/ant/target
[INFO] --- maven-clean-plugin:2.4.1:clean (default-clean) @ pdfbox-ant ---
[INFO]
[INFO] --- maven-remote-resources-plugin:1.2.1:process (default) @ pdfbox-ant ---
[INFO]
[INFO] --- maven-resources-plugin:2.5:resources (default-resources) @ pdfbox-ant ---
[debug] execute contextualize
[INFO] Using 'ISO-8859-1' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory https://builds.apache.org/job/PDFBox-trunk/ws/trunk/ant/src/main/resources
[INFO] Copying 3 resources
[INFO]
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ pdfbox-ant ---
[INFO] Compiling 1 source file to https://builds.apache.org/job/PDFBox-trunk/ws/trunk/ant/target/classes
[INFO]
[INFO] --- maven-resources-plugin:2.5:testResources (default-testResources) @ pdfbox-ant ---
[debug] execute contextualize
[INFO] Using 'ISO-8859-1' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory https://builds.apache.org/job/PDFBox-trunk/ws/trunk/ant/src/test/resources
[INFO] Copying 3 resources
[INFO]
[INFO] --- maven-compiler-plugin:2.3.2:testCompile (default-testCompile) @ pdfbox-ant ---
[INFO] No sources to compile
[INFO]
[INFO] --- maven-surefire-plugin:2.9:test (default-test) @ pdfbox-ant ---
[INFO] Surefire report directory: https://builds.apache.org/job/PDFBox-trunk/ws/trunk/ant/target/surefire-reports
--- T E S T S ---
Results :
Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
[JENKINS] Recording test results
[INFO]
[INFO] --- maven-jar-plugin:2.3.1:jar (default-jar) @ pdfbox-ant ---
[jira] [Commented] (PDFBOX-1586) IndexOutOfBoundsException when saving a document (at random)
[ https://issues.apache.org/jira/browse/PDFBOX-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649108#comment-13649108 ]

Andreas Lehmkühler commented on PDFBOX-1586:

I removed/reworked the direct access to the scratch file in revision 1479136. Now it should be easier to find the real cause of the described issue, and maybe it'll be easier to remove the direct access altogether.

IndexOutOfBoundsException when saving a document (at random)

Key: PDFBOX-1586
URL: https://issues.apache.org/jira/browse/PDFBOX-1586
Project: PDFBox
Issue Type: Bug
Affects Versions: 1.8.1
Reporter: James Green
Priority: Critical

Getting the following stacktrace:

org.apache.pdfbox.exceptions.COSVisitorException: java.lang.IndexOutOfBoundsException: Index: 28, Size: 0
    at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1245)
    at org.apache.pdfbox.cos.COSStream.accept(COSStream.java:201)
    at org.apache.pdfbox.cos.COSObject.accept(COSObject.java:206)
    at org.apache.pdfbox.pdfwriter.COSWriter.doWriteObject(COSWriter.java:524)
    at org.apache.pdfbox.pdfwriter.COSWriter.doWriteBody(COSWriter.java:434)
    at org.apache.pdfbox.pdfwriter.COSWriter.visitFromDocument(COSWriter.java:1056)
    at org.apache.pdfbox.cos.COSDocument.accept(COSDocument.java:496)
    at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1392)
    at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1157)
    at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1138)
    ...
Caused by: java.lang.IndexOutOfBoundsException: Index: 28, Size: 0
    at java.util.ArrayList.rangeCheck(ArrayList.java:604)
    at java.util.ArrayList.get(ArrayList.java:382)
    at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:84)
    at org.apache.pdfbox.io.RandomAccessFileInputStream.read(RandomAccessFileInputStream.java:96)
    at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
    at java.io.BufferedInputStream.read1(BufferedInputStream.java:275)
    at java.io.BufferedInputStream.read(BufferedInputStream.java:334)
    at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1232)

I'll add some context. We have a data pipeline in which a Windows Print Monitor sends PostScript to a servlet, which then uses Ghostscript 9.05 to convert it in memory to PDF. This PDF is then loaded into PDFBox using PDDocument.load(). At this point we split the original PDF into multiple smaller ones, each of which is saved to a ByteArrayOutputStream. At the point of save() we are having serious reliability issues.

We saved an original PDF from Ghostscript into a unit test to try to replicate the problem, without success. If we re-execute the pipeline to take the original PDF and split it, we get apparently random percentages of saved documents. For instance, on a 990-page document (text, no images), to be split into 990 one-page documents using Tomcat 7 with -Xmx512m:

Pass 1: 50% were saved, 50% ended with stack traces
Pass 2: 100% were saved
Pass 3: 100% were saved

The same test with -Xmx128m ended several times with just 1 document saved; the rest were stack traces.

We have also seen this randomly hit a sample document consisting of four pages to be split into two two-page documents, so it does not appear to be memory related.

We also added code to catch the IndexOutOfBoundsException and retry up to ten times, but it seems the save() either works the first time or not at all.

We're thinking there are environmental factors here, but we're now focused on getting this nailed. Any advice or assistance will be welcomed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
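
The "Caused by" part of the trace above shows ArrayList.get failing inside RandomAccessBuffer.seek on a list of size 0. One way such a failure can arise is sketched below as a toy model. It is not PDFBox's code: the class ChunkedBuffer, its chunk size and its methods are all hypothetical, and the assumption is only that a buffer keeps its data in chunks held in a list which something empties before a later seek.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: hypothetical names, not the PDFBox RandomAccessBuffer. */
public class ChunkedBuffer {
    private static final int CHUNK_SIZE = 1024; // assumed chunk size
    private final List<byte[]> chunks = new ArrayList<byte[]>();
    private byte[] current;

    /** Allocates enough zero-filled chunks to hold the given number of bytes. */
    public ChunkedBuffer(int length) {
        for (int i = 0; i * CHUNK_SIZE < length; i++) {
            chunks.add(new byte[CHUNK_SIZE]);
        }
    }

    /** Selects the chunk containing the position; fails if the chunk list was emptied. */
    public void seek(long position) {
        int index = (int) (position / CHUNK_SIZE);
        current = chunks.get(index); // throws IndexOutOfBoundsException when index >= size
    }

    /** Releases all chunks, as closing the owner of the buffer might. */
    public void close() {
        chunks.clear();
        current = null;
    }

    public static void main(String[] args) {
        ChunkedBuffer buffer = new ChunkedBuffer(30 * CHUNK_SIZE);
        buffer.seek(28L * CHUNK_SIZE); // fine while the chunks exist
        buffer.close();
        try {
            buffer.seek(28L * CHUNK_SIZE); // asks for chunk 28 of an empty list
        } catch (IndexOutOfBoundsException e) {
            System.out.println("seek after close failed: " + e.getMessage());
        }
    }
}
```

In this model the same seek succeeds or fails depending only on whether close() ran first, which would fit a save() that "either works the first time or not at all".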
[jira] [Comment Edited] (PDFBOX-1586) IndexOutOfBoundsException when saving a document (at random)
[ https://issues.apache.org/jira/browse/PDFBOX-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649108#comment-13649108 ]

Andreas Lehmkühler edited comment on PDFBOX-1586 at 5/4/13 4:49 PM:

I removed/reworked some of the direct accesses to the scratch file in revision 1479136. Now it should be easier to find the real cause of the described issue, and maybe it'll be easier to remove the direct access altogether.

was (Author: lehmi):
I removed/reworked the direct access to the scratch file in revision 1479136. Now it should be easier to find the real cause of the described issue, and maybe it'll be easier to remove the direct access altogether.
[jira] [Commented] (PDFBOX-1586) IndexOutOfBoundsException when saving a document (at random)
[ https://issues.apache.org/jira/browse/PDFBOX-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13649156#comment-13649156 ]

James Green commented on PDFBOX-1586:

I looked at this today. I now understand how we experience this problem.

We have a class, Divider, which has a number of methods. We'll start at method a(), which accepts a ByteArrayInputStream that holds a PDF. a() calls splitter(), handing it this ByteArrayInputStream and expecting a list of PDDocuments back.

    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.List;

    import org.apache.pdfbox.exceptions.COSVisitorException;
    import org.apache.pdfbox.pdmodel.PDDocument;
    import org.apache.pdfbox.pdmodel.PDPage;

    class Divider {
        // Splits the PDF and saves each resulting document to a byte array.
        public List<byte[]> a(ByteArrayInputStream pdfBytes) throws IOException, COSVisitorException {
            List<PDDocument> split = splitter(pdfBytes);
            List<byte[]> retVal = new ArrayList<byte[]>();
            for (PDDocument p : split) {
                ByteArrayOutputStream os = new ByteArrayOutputStream();
                p.save(os); // this is the call that crashes
                retVal.add(os.toByteArray());
            }
            return retVal;
        }

        public List<PDDocument> splitter(ByteArrayInputStream masterPdf) throws IOException {
            List<PDDocument> retVal = new ArrayList<PDDocument>();
            PDDocument doc = PDDocument.load(masterPdf);
            List<PDPage> pages = doc.getDocumentCatalog().getAllPages();
            // Iterate over the pages and import each page into new PDDocuments added to retVal
            return retVal; // the master document goes out of scope here
        }
    }

Because splitter() internally creates new PDDocuments and performs importPage() referencing the individual pages from the master document, the master document falls out of scope the moment the splitter's work is done. Not unreasonable.

The trouble is that importPage() passes the new PDPage a reference to the original's scratchFile. So the moment the GC cleans up the master document after control has returned to a(), the scratchFile is closed, causing a()'s saving to crash.

All blindingly obvious once you realise that the scratchFile reference is being copied inside the importPage routine, which most people might expect to perform a clean clone.

That hopefully concludes things. We can of course rework our code to avoid the bug, but it would be sensible to make importPage perform a proper clone at some point.
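
The lifetime problem described in this comment can be reproduced in miniature without PDFBox. The sketch below is a toy model under stated assumptions: Scratch, Doc, importShared and importCopied are hypothetical names, not PDFBox API, and the explicit master.close() stands in for whatever closes the scratch file when the master document is collected.

```java
/** Toy model of the lifetime problem; every name here is hypothetical, not PDFBox API. */
public class SharedScratchDemo {

    /** Stands in for a scratch file that becomes unreadable once closed. */
    static class Scratch {
        private byte[] data;

        Scratch(byte[] data) {
            this.data = data;
        }

        byte[] read() {
            if (data == null) {
                throw new IllegalStateException("scratch closed");
            }
            return data.clone();
        }

        void close() {
            data = null;
        }
    }

    /** Stands in for a document whose content lives in a scratch area. */
    static class Doc {
        final Scratch scratch;

        Doc(Scratch scratch) {
            this.scratch = scratch;
        }

        byte[] save() {
            return scratch.read();
        }

        void close() {
            scratch.close();
        }
    }

    /** Shallow import: the new document keeps pointing at the master's scratch. */
    static Doc importShared(Doc master) {
        return new Doc(master.scratch);
    }

    /** Deep import: the new document gets its own copy of the data. */
    static Doc importCopied(Doc master) {
        return new Doc(new Scratch(master.scratch.read()));
    }

    public static void main(String[] args) {
        Doc master = new Doc(new Scratch(new byte[] {1, 2, 3}));
        Doc shared = importShared(master);
        Doc copied = importCopied(master);
        master.close(); // stands in for the master document being cleaned up

        System.out.println("copied saves " + copied.save().length + " bytes");
        try {
            shared.save();
        } catch (IllegalStateException e) {
            System.out.println("shared save failed: " + e.getMessage());
        }
    }
}
```

In terms of this model, the two ways out are the ones the comment names: either the import makes its own copy, or the caller keeps the master alive until every derived document has been saved and only then closes it.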