[
https://issues.apache.org/jira/browse/PDFBOX-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15034433#comment-15034433
]
Jim deVos commented on PDFBOX-3142:
-----------------------------------
Andreas - thanks for your reply. I'll run these source documents through a pdf
validator to see what it finds. Individually they open just fine (i.e. no
blank pages) in various pdf viewers, but I suspect that these viewers are
pretty forgiving w/ non-compliant files. On that note, it would be nice to
know of a way to anticipate if the file will cause these issues before
attempting to merge it with a coverpage. At the moment all I see is the
aforementioned error message in the log, but I don't see a way to interrogate
the parser to see if it has issues w/ the file.
As for v2, that's a good suggestion. I'll rewrite my test for 2.0.0 and report
the results.
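One rough way to follow up on that log message, sketched here under stated assumptions: the warning names an object number, a generation number and a byte offset, so a small stand-alone helper can check whether the file really has the expected object header at that offset. The class and method names below (`XrefOffsetCheck`, `parseLogLine`, `objectHeaderAt`) are hypothetical and are not part of PDFBox. The check is diagnostic only and does not predict every file that will merge badly.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical diagnostic helper; not part of the PDFBox API. */
final class XrefOffsetCheck {

    // Matches the log text quoted in this issue, e.g.
    // "Can't find the object 52 0 (origin offset 7187557)".
    private static final Pattern MISSING_OBJECT =
            Pattern.compile("Can't find the object (\\d+) (\\d+) \\(origin offset (\\d+)\\)");

    /** Returns {objectNumber, generation, offset}, or null if the line does not match. */
    static long[] parseLogLine(String line) {
        Matcher m = MISSING_OBJECT.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }

    /**
     * Returns true if the bytes at the given offset begin with "<objNum> <gen> obj".
     * Simplification (assumption): the header is compared with single spaces only,
     * although PDF permits other whitespace between the tokens.
     */
    static boolean objectHeaderAt(Path pdf, long offset, long objNum, long gen)
            throws IOException {
        byte[] expected = (objNum + " " + gen + " obj").getBytes(StandardCharsets.US_ASCII);
        try (RandomAccessFile raf = new RandomAccessFile(pdf.toFile(), "r")) {
            // An offset outside the file can never point at a valid object.
            if (offset < 0 || offset + expected.length > raf.length()) {
                return false;
            }
            raf.seek(offset);
            byte[] actual = new byte[expected.length];
            raf.readFully(actual);
            return Arrays.equals(expected, actual);
        }
    }
}
```

Feeding the logged line to `parseLogLine` and passing the three values to `objectHeaderAt` together with the source file would show whether the offset recorded for that object is stale, which is the kind of defect a forgiving viewer can paper over.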
> PDFMergerUtility with scratch file generates result with blank pages for
> certain source files.
> ----------------------------------------------------------------------------------------------
>
> Key: PDFBOX-3142
> URL: https://issues.apache.org/jira/browse/PDFBOX-3142
> Project: PDFBox
> Issue Type: Bug
> Components: Utilities
> Affects Versions: 1.8.10
> Environment: Ubuntu 14.04.3, java 1.8.0_66
> Reporter: Jim deVos
>
> My team uses PDFMergerUtility to attach cover pages to various pdfs. We
> recently tried utilizing a scratch file (e.g.
> PDFMergerUtility.mergeDocumentsNonSeq()) to cut down on the amount of RAM we
> are using. This approach works for the majority of pdf's in our system, but
> some files cause the merger utility to generate resultant pdf's with a blank
> page. Specifically, the result pdf contains a blank page after the coverpage
> instead of the first page of the second document sent to merger utility.
> Whenever this problem occurs, we see the following line in our logs:
> {{org.apache.pdfbox.pdfparser.NonSequentialPDFParser - Can't find the object
> 52 0 (origin offset 7187557)}}
> I'll try to attach/link an example pdf soon, but currently I don't have
> permission to redistribute any files that exhibit the problem. However,
> here's a simple snippet that replicates the problem - it's pretty
> straightforward.
> {code}
> @Test
> public void testMergeNonSeq() throws IOException, COSVisitorException {
>     destinationPdf = new File(TMP_FOLDER, "result-nonseq.pdf");
>     PDFMergerUtility ut = new PDFMergerUtility();
>     RandomAccess ram = new RandomAccessFile(File.createTempFile("mergeram", ".bin"), "rw");
>     ut.addSource(coverpagePdf);
>     ut.addSource(documentPdf);
>     ut.setDestinationFileName(destinationPdf.getCanonicalPath());
>     ut.mergeDocumentsNonSeq(ram);
>
>     // the only automated way we have to tell that something went wrong is
>     // to check the size of the result
>     assertThat("destination pdf should be larger than the original pdf",
>             destinationPdf.length(), is(greaterThan(documentPdf.length())));
> }
> {code}
> Note we only see this problem with PDFMergerUtility.mergeDocumentsNonSeq().
> Using PDFMergerUtility.mergeDocuments() does not exhibit any problems.