Tim Shaffer created PDFBOX-4908:
-----------------------------------

             Summary: PDFMergerUtility.mergeInto() does not deep copy metadata
                 Key: PDFBOX-4908
                 URL: https://issues.apache.org/jira/browse/PDFBOX-4908
             Project: PDFBox
          Issue Type: Bug
          Components: Utilities
    Affects Versions: 2.0.18
         Environment: Windows, JDK12
            Reporter: Tim Shaffer
         Attachments: bad1.pdf, bad2.pdf, blank.pdf

After merging two documents, closing the source document prevents the 
destination document from being saved.
{code:java}
// mainDoc can be any existing PDF
PDDocument mainDoc = PDDocument.load(new File("blank.pdf"));
PDDocument appendDoc = PDDocument.load(new File("bad1.pdf"));
//PDDocument appendDoc = PDDocument.load(new File("bad2.pdf"));

PDFMergerUtility pdfMerger = new PDFMergerUtility();
pdfMerger.appendDocument(mainDoc, appendDoc);
appendDoc.close();
// Exception thrown during save()
mainDoc.save("temp.pdf");
mainDoc.close();
{code}
Exception:
{noformat}
java.io.IOException: COSStream has been closed and cannot be read. Perhaps its enclosing PDDocument has been closed?
        at org.apache.pdfbox.cos.COSStream.checkClosed(COSStream.java:83)
        at org.apache.pdfbox.cos.COSStream.createRawInputStream(COSStream.java:133)
        at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1219)
        at org.apache.pdfbox.cos.COSStream.accept(COSStream.java:404)
        at org.apache.pdfbox.cos.COSObject.accept(COSObject.java:158)
        at org.apache.pdfbox.pdfwriter.COSWriter.doWriteObject(COSWriter.java:526)
        at org.apache.pdfbox.pdfwriter.COSWriter.doWriteObjects(COSWriter.java:464)
        at org.apache.pdfbox.pdfwriter.COSWriter.doWriteBody(COSWriter.java:448)
        at org.apache.pdfbox.pdfwriter.COSWriter.visitFromDocument(COSWriter.java:1113)
        at org.apache.pdfbox.cos.COSDocument.accept(COSDocument.java:449)
        at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1386)
        at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1273)
        at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1357)
        at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1328)
        at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1316)
        at Main.main(Main.java:60)
{noformat}
Attached are two PDFs from different sources that both trigger the bug.  All 
sensitive data has been removed, so they contain only blank pages, but the 
structure that causes the above exception is still present.  Also attached is 
blank.pdf (another blank document), which I have been using as the 
destination.

The cause seems to be these lines in PDFMergerUtility:
{code:java}
PDDocumentInformation destInfo = destination.getDocumentInformation();
PDDocumentInformation srcInfo = source.getDocumentInformation();
mergeInto(srcInfo.getCOSObject(), destInfo.getCOSObject(), Collections.<COSName>emptySet());
{code}
I've tried altering the code to use PDFCloneUtility to clone 
srcInfo.getCOSObject() before passing it to mergeInto().  That seems to fix the 
issue, but I'm not familiar enough with the code to say whether it is the 
correct fix.
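
To illustrate the suspected failure mode without depending on PDFBox, here is a minimal sketch in plain Java. Every class and method in it (FakeStream, FakeDocument, mergeShallow, mergeDeep, savesAfterSourceClosed) is a hypothetical stand-in, not PDFBox API, and it assumes the source info dictionary references a stream-like object that becomes unreadable once its owning document is closed. It contrasts a merge that shares references with one that copies the data first.

```java
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class ShallowMergeDemo {

    /** Hypothetical stand-in for a stream object owned by one document. */
    static final class FakeStream {
        private final byte[] data;
        private boolean closed;

        FakeStream(byte[] data) {
            this.data = data.clone();
        }

        /** Reading fails once the owning document has closed this stream. */
        byte[] read() throws IOException {
            if (closed) {
                throw new IOException("stream has been closed and cannot be read");
            }
            return data.clone();
        }

        void close() {
            closed = true;
        }

        /** Creates an independent stream that survives closing the original. */
        FakeStream deepCopy() throws IOException {
            return new FakeStream(read());
        }
    }

    /** Hypothetical stand-in for a document with an info dictionary. */
    static final class FakeDocument {
        final Map<String, FakeStream> info = new LinkedHashMap<>();

        /** Closing a document closes every stream it references. */
        void close() {
            for (FakeStream s : info.values()) {
                s.close();
            }
        }

        /** Saving must read every referenced stream; returns bytes written. */
        int save() throws IOException {
            int total = 0;
            for (FakeStream s : info.values()) {
                total += s.read().length;
            }
            return total;
        }
    }

    /** Shares the source's stream objects with the destination. */
    static void mergeShallow(FakeDocument src, FakeDocument dst) {
        for (Map.Entry<String, FakeStream> e : src.info.entrySet()) {
            dst.info.putIfAbsent(e.getKey(), e.getValue());
        }
    }

    /** Copies the stream data so the destination owns its own objects. */
    static void mergeDeep(FakeDocument src, FakeDocument dst) throws IOException {
        for (Map.Entry<String, FakeStream> e : src.info.entrySet()) {
            if (!dst.info.containsKey(e.getKey())) {
                dst.info.put(e.getKey(), e.getValue().deepCopy());
            }
        }
    }

    /** Merges, closes the source, then reports whether the destination saves. */
    static boolean savesAfterSourceClosed(boolean deep) {
        FakeDocument dst = new FakeDocument();
        FakeDocument src = new FakeDocument();
        src.info.put("Custom", new FakeStream(new byte[] {1, 2, 3}));
        try {
            if (deep) {
                mergeDeep(src, dst);
            } else {
                mergeShallow(src, dst);
            }
            src.close();
            dst.save();
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("shallow merge saves: " + savesAfterSourceClosed(false));
        System.out.println("deep merge saves: " + savesAfterSourceClosed(true));
    }
}
```

In this model the shallow merge fails at save time because the destination still points at a stream the source has closed, while the deep merge succeeds. Whether cloning through PDFCloneUtility is the right way to get the same effect inside PDFMergerUtility is still the open question above.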

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
