[jira] [Commented] (PDFBOX-3280) PDDocument.importPage does not deep clone source page

Robert Onslow (JIRA) Tue, 18 Oct 2016 12:06:26 -0700

    [ https://issues.apache.org/jira/browse/PDFBOX-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15586359#comment-15586359 ]


Robert Onslow commented on PDFBOX-3280:
---------------------------------------

Running this code on the previously attached PDF generates the previously 
attached output, test.pdf, with blank pages. I suspect this is related to the 
deep-cloning problem:

    try {
        PDDocument doc = PDDocument.load(new File("023_CASEROOM-BROCHURE.PDF"));
        //PDPage page = doc.getPage(0);
        PDDocument doc1 = new PDDocument();
        // Import every page of the source document into the new document.
        for (int i = 0; i < doc.getNumberOfPages(); i++) {
            PDPage page = doc.getPage(i);
            doc1.importPage(page);
        }
        doc1.save("test.pdf");
        doc1.close();
        // The source is closed only after the target has been saved.
        doc.close();
    } catch (Exception x) {
        throw new RuntimeException(x);
    }

Robert

> PDDocument.importPage does not deep clone source page
> -----------------------------------------------------
>
>                 Key: PDFBOX-3280
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3280
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.0, 2.0.1, 2.0.2, 2.1.0
>            Reporter: Cornelis Hoeflake
>         Attachments: 023_CASEROOM-BROCHURE.PDF, 1.pdf, test.pdf
>
>
> The method PDDocument.importPage does not deep clone the source page. This 
> causes two issues. First, closing the source document BEFORE saving the target 
> document throws an already closed exception. Placing the close after saving 
> the target document works fine. Second, splitting a document into a lot of 
> small documents and then saving those documents from multiple threads causes 
> random exceptions such as ArrayIndexOutOfBounds, COSStream closed, etc.
> For example, see the following code. The source document used is attached.
> {code:title=Test.java|borderStyle=solid}
>         PDDocument doc = new PDDocument();
>         PDDocument load = PDDocument.load(new File(SOURCE_DOC));
>         for (int p = 0; p<1000; p++) {
>             doc.importPage(load.getPage(0));
>         }
>         ByteArrayOutputStream baos = new ByteArrayOutputStream();
>         doc.save(baos);
>         doc.close();
>         load.close();
>         final PDDocument doc2 = PDDocument.load(baos.toByteArray());
> // ok, now we have a big document loaded as it normally will be loaded.
>         ExecutorService es = Executors.newFixedThreadPool(4);
>         final List<PDDocument> docs = Lists.newArrayList();
>         for (int p = 0; p<doc2.getNumberOfPages(); p++) {
>             final PDDocument newDoc = new PDDocument();
>             newDoc.importPage(doc2.getPage(p));
>             docs.add(newDoc);
>         }
>         for (int p = 0; p<doc2.getNumberOfPages(); p++) {
>             final int page = p;
>             es.submit(new Runnable() {
>                 @Override
>                 public void run() {
>                     try {
>                         PDDocument newDoc = docs.get(page);
>                         newDoc.save(new ByteArrayOutputStream());
>                         newDoc.close();
>                     } catch (IOException e) {
>                         e.printStackTrace();
>                     }
>                 }
>             });
>         }
>         es.shutdown();
>     }
> {code}
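
The first issue in the description (closing the source before saving the
target) can be illustrated without PDFBox. The following is only a minimal
model of the reported behaviour, not PDFBox code and not a description of how
PDFBox is implemented: the classes Stream and Doc and the methods
importShallow and importDeep are hypothetical names made up for this sketch.
It assumes that an imported page keeps a reference to a stream owned by the
source document, which is what the "already closed" exception suggests.

```java
import java.util.ArrayList;
import java.util.List;

/** Library-free model of the report; none of these types are PDFBox classes. */
class ShallowImportSketch {

    /** Stands in for page content that becomes unreadable once closed. */
    static final class Stream {
        private byte[] data;

        Stream(byte[] data) { this.data = data; }

        byte[] read() {
            if (data == null) {
                throw new IllegalStateException("stream already closed");
            }
            return data;
        }

        /** Copies the bytes so the copy no longer depends on this stream. */
        Stream deepCopy() { return new Stream(read().clone()); }

        void close() { data = null; }
    }

    /** Stands in for a document that closes only the streams it owns. */
    static final class Doc {
        private final List<Stream> pages = new ArrayList<>();
        private final List<Stream> owned = new ArrayList<>();

        Stream addPage(byte[] content) {
            Stream s = new Stream(content);
            pages.add(s);
            owned.add(s);
            return s;
        }

        /** Shares the source's stream: the assumed behaviour behind the bug. */
        void importShallow(Stream page) { pages.add(page); }

        /** Copies the stream so the target owns its own data. */
        void importDeep(Stream page) {
            Stream copy = page.deepCopy();
            pages.add(copy);
            owned.add(copy);
        }

        /** "Saving" here just reads every page and returns the byte count. */
        int save() {
            int total = 0;
            for (Stream s : pages) {
                total += s.read().length;
            }
            return total;
        }

        void close() {
            for (Stream s : owned) {
                s.close();
            }
        }
    }

    /** Imports one page, closes the source first, then tries to save. */
    static boolean saveSucceedsAfterSourceClosed(boolean deep) {
        Doc source = new Doc();
        Stream page = source.addPage(new byte[] {1, 2, 3});
        Doc target = new Doc();
        if (deep) {
            target.importDeep(page);
        } else {
            target.importShallow(page);
        }
        source.close();
        try {
            target.save();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("shallow import: " + saveSucceedsAfterSourceClosed(false));
        System.out.println("deep import:    " + saveSucceedsAfterSourceClosed(true));
    }
}
```

In this model the shallow import fails on save once the source is closed,
while the deep import does not. Under the same assumption, shared streams
would also be a plausible source of the multithreaded failures, since several
target documents would then read and close the same underlying objects
concurrently.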



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

