[ 
https://issues.apache.org/jira/browse/PDFBOX-1920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13920743#comment-13920743
 ] 

James Carter commented on PDFBOX-1920:
--------------------------------------

Hi Timo, I'm the developer of the 3rd-party solution that uses PDFBox in 
this case.  If I understand the thread correctly, 3rd-party PDF applications 
are creating invalid PDFs that PDFBox attempts to 'repair'. I've tried 
increasing the pushBackSize property, but I now get a different exception 
during the merge (code excerpt and exception below). Is this something that 
PDFBox could handle/repair, or do we need to handle it elsewhere (e.g. 
validate the PDFs users upload and tell them when one is invalid)?

System.setProperty("org.apache.pdfbox.baseParser.pushBackSize", "999000");
PDFMergerUtility mergePdf = new PDFMergerUtility();
FileOutputStream fos = new FileOutputStream("test.pdf");

mergePdf.addSource("docs/01. Heads of Terms (Signed).pdf");
mergePdf.setDestinationStream(fos);
mergePdf.mergeDocuments();


Exception in thread "main" java.io.IOException: expected='endstream' actual='' org.apache.pdfbox.io.PushBackInputStream@45cb0cdc
    at org.apache.pdfbox.pdfparser.BaseParser.parseCOSStream(BaseParser.java:609)
    at org.apache.pdfbox.pdfparser.PDFParser.parseObject(PDFParser.java:605)
    at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:194)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1219)
    at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:1186)
    at org.apache.pdfbox.util.PDFMergerUtility.mergeDocuments(PDFMergerUtility.java:196)
    at com.acme.MergePDF.runSmartService(MergePDF.java:52)
    at com.acme.MergePDF.main(MergePDF.java:68)
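
For the "validate on upload" option raised above, a cheap structural pre-check might look like the sketch below. It uses only the JDK, not PDFBox, and only tests that a file starts with the "%PDF-" header and has a "%%EOF" marker near its end. The class name PdfUploadPrecheck, the method looksLikeCompletePdf and the 1024-byte tail window are all hypothetical choices made for illustration. Whether such a check would have rejected the file in this report is an assumption: the empty actual='' value may mean the parser ran off the end of a truncated file, but the stack trace alone does not confirm that.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class PdfUploadPrecheck {

    // How many trailing bytes to search for the %%EOF marker (arbitrary, hypothetical value).
    private static final int TAIL_WINDOW = 1024;

    /** Returns true if the bytes start with "%PDF-" and contain "%%EOF" near the end. */
    public static boolean looksLikeCompletePdf(byte[] data) {
        // ISO-8859-1 maps every byte to exactly one char, so binary content is not mangled.
        String head = new String(data, 0, Math.min(data.length, 5), StandardCharsets.ISO_8859_1);
        if (!head.equals("%PDF-")) {
            return false;
        }
        int tailStart = Math.max(0, data.length - TAIL_WINDOW);
        String tail = new String(data, tailStart, data.length - tailStart, StandardCharsets.ISO_8859_1);
        return tail.contains("%%EOF");
    }

    /** Convenience overload; reads the whole file into memory, which is fine for a sketch. */
    public static boolean looksLikeCompletePdf(Path file) throws IOException {
        return looksLikeCompletePdf(Files.readAllBytes(file));
    }

    public static void main(String[] args) throws IOException {
        // Check every file named on the command line before it is handed to the merge step.
        for (String name : args) {
            boolean ok = looksLikeCompletePdf(Paths.get(name));
            System.out.println(name + ": " + (ok ? "passes the pre-check" : "rejected by the pre-check"));
        }
    }
}
```

A file that passes this check can still fail to parse, so it would only be a first filter in front of the merge, not a replacement for catching the IOException from mergeDocuments().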

> Buffer Error when trying to run node
> ------------------------------------
>
>                 Key: PDFBOX-1920
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1920
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Utilities
>            Reporter: Chris Hewkin
>            Assignee: Timo Boehme
>         Attachments: Application.zip
>
>
> Description: Trying to merge PDF using the latest Merge PDF Node but getting 
> the following error 
> There is a problem with task “Merge PDF” in the process “Create Application 
> Pack” 
> Problem: An error occurred in executing an Activity Class. 
> Details: org.apache.pdfbox.exceptions.WrappedIOException: Could not push back 
> 628696 bytes in order to reparse stream. Try increasing push back buffer 
> using system property org.apache.pdfbox.baseParser.pushBackSize 
> Recommended Action: Examine the activity class to correct the error and then 
> resume. 
> Priority of this problem: High Priority 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
