[ https://issues.apache.org/jira/browse/PDFBOX-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033542#comment-15033542 ]
Andreas Lehmkühler commented on PDFBOX-3142:
--------------------------------------------

It sounds like those PDFs are malformed and the non-sequential parser isn't able to repair them. Have you tried 2.0.0? It contains a lot of improvements and bug fixes, and not all of them are or will be backported to 1.8.x. The second RC is available through the download page.

> PDFMergerUtility with scratch file generates result with blank pages for certain source files.
> ----------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-3142
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3142
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 1.8.10
>         Environment: Ubuntu 14.04.3, java 1.8.0_66
>            Reporter: Jim deVos
>
> My team uses PDFMergerUtility to attach cover pages to various PDFs. We
> recently tried using a scratch file (e.g.
> PDFMergerUtility.mergeDocumentsNonSeq()) to cut down on the amount of RAM we
> are using. This approach works for the majority of PDFs in our system, but
> some files cause the merger utility to generate a result PDF with a blank
> page. Specifically, the result PDF contains a blank page after the cover page
> instead of the first page of the second document sent to the merger utility.
> Whenever this problem occurs, we see the following line in our logs:
> {{org.apache.pdfbox.pdfparser.NonSequentialPDFParser - Can't find the object
> 52 0 (origin offset 7187557)}}
> I'll try to attach/link an example PDF soon, but currently I don't have
> permission to redistribute any files that exhibit the problem. However,
> here's a simple snippet that replicates the problem - it's pretty
> straightforward.
> {code}
> @Test
> public void testMergeNonSeq() throws IOException, COSVisitorException {
>     destinationPdf = new File(TMP_FOLDER, "result-nonseq.pdf");
>     PDFMergerUtility ut = new PDFMergerUtility();
>     RandomAccess ram = new RandomAccessFile(File.createTempFile("mergeram", ".bin"), "rw");
>     ut.addSource(coverpagePdf);
>     ut.addSource(documentPdf);
>     ut.setDestinationFileName(destinationPdf.getCanonicalPath());
>     ut.mergeDocumentsNonSeq(ram);
>
>     // the only automated way we have to tell that something went wrong is to check the size of the result
>     assertThat("destination pdf should be larger than the original pdf",
>             destinationPdf.length(), is(greaterThan(documentPdf.length())));
> }
> {code}
> Note we only see this problem with PDFMergerUtility.mergeDocumentsNonSeq().
> Using PDFMergerUtility.mergeDocuments() does not exhibit any problems.
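
The report says the only automated check available is the size of the result, and that a specific parser warning appears whenever the blank page does. As a rough sketch only, assuming the test can capture log output as a list of strings, that warning could serve as a second signal. The class MissingObjectLogCheck and its method are hypothetical helpers, not part of PDFBox, and the pattern is derived solely from the single log line quoted in the report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not PDFBox API): scans captured log lines for the
// "Can't find the object" warning quoted in the report.
final class MissingObjectLogCheck {

    // Matches e.g. "Can't find the object 52 0 (origin offset 7187557)".
    // Assumption: other occurrences follow the same wording as this one line.
    private static final Pattern MISSING_OBJECT = Pattern.compile(
            "Can't find the object (\\d+) (\\d+) \\(origin offset (\\d+)\\)");

    /** Returns one "objectNumber generation @offset" entry per matching line. */
    static List<String> findMissingObjects(List<String> logLines) {
        List<String> hits = new ArrayList<String>();
        for (String line : logLines) {
            Matcher m = MISSING_OBJECT.matcher(line);
            if (m.find()) {
                hits.add(m.group(1) + " " + m.group(2) + " @" + m.group(3));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Sample input: the log line from the report plus an unrelated line.
        List<String> captured = Arrays.asList(
                "org.apache.pdfbox.pdfparser.NonSequentialPDFParser - "
                        + "Can't find the object 52 0 (origin offset 7187557)",
                "some unrelated log line");
        System.out.println(findMissingObjects(captured)); // prints [52 0 @7187557]
    }
}
```

In a test like the one quoted above, a non-empty result could be treated as a failure alongside the file-size assertion. How the log lines get captured depends on the logging setup and is left out here.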