[ 
https://issues.apache.org/jira/browse/PDFBOX-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033542#comment-15033542
 ] 

Andreas Lehmkühler commented on PDFBOX-3142:
--------------------------------------------

Sounds like those pdfs are malformed and the non-sequential parser isn't able 
to repair them.

Did you ever give 2.0.0 a try? It contains a lot of improvements and bugfixes, 
and not all of them are/will be backported to 1.8.x. The second RC is available 
through the download page.

> PDFMergerUtility with scratch file generates result with blank pages for 
> certain source files.
> ----------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-3142
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3142
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Utilities
>    Affects Versions: 1.8.10
>         Environment: Ubuntu 14.04.3, java 1.8.0_66
>            Reporter: Jim deVos
>
> My team uses PDFMergerUtility to attach cover pages to various PDFs. We 
> recently tried using a scratch file (e.g. 
> PDFMergerUtility.mergeDocumentsNonSeq()) to cut down on the amount of RAM we 
> are using. This approach works for the majority of PDFs in our system, but 
> some files cause the merger utility to generate a result PDF with a blank 
> page. Specifically, the result PDF contains a blank page after the cover page 
> instead of the first page of the second document sent to the merger utility.
> Whenever this problem occurs, we see the following line in our logs:
> {{org.apache.pdfbox.pdfparser.NonSequentialPDFParser - Can't find the object 
> 52 0 (origin offset 7187557)}}
> I'll try to attach/link an example pdf soon, but currently I don't have 
> permission to redistribute any files that exhibit the problem.  However,  
> here's a simple snippet that replicates the problem - it's pretty 
> straightforward.
> {code}
>     @Test
>     public void testMergeNonSeq() throws IOException, COSVisitorException {
>         destinationPdf = new File(TMP_FOLDER, "result-nonseq.pdf");
>         PDFMergerUtility ut = new PDFMergerUtility();
>         RandomAccess ram = new RandomAccessFile(File.createTempFile("mergeram", ".bin"), "rw");
>         ut.addSource(coverpagePdf);
>         ut.addSource(documentPdf);
>         ut.setDestinationFileName(destinationPdf.getCanonicalPath());
>         ut.mergeDocumentsNonSeq(ram);
>
>         // The only automated way we have to tell that something went wrong is
>         // to check the size of the result.
>         assertThat("destination pdf should be larger than the original pdf",
>                 destinationPdf.length(), is(greaterThan(documentPdf.length())));
>     }
> {code}
> Note we only see this problem with PDFMergerUtility.mergeDocumentsNonSeq().  
> Using PDFMergerUtility.mergeDocuments() does not exhibit any problems.


