[ https://issues.apache.org/jira/browse/PDFBOX-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15338497#comment-15338497 ]
ASF subversion and git services commented on PDFBOX-3280:
---------------------------------------------------------

Commit 1749157 from [~tilman] in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1749157 ]

PDFBOX-3280: revert 1741294 because it resulted in split creating huge files

> PDDocument.importPage does not deep clone source page
> -----------------------------------------------------
>
>                 Key: PDFBOX-3280
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3280
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.0, 2.0.1, 2.0.2, 2.1.0
>            Reporter: Cornelis Hoeflake
>         Attachments: 1.pdf
>
>
> The method PDDocument.importPage does not deep clone the source page. This
> causes two issues. First, closing the source document BEFORE saving the
> target document throws an "already closed" exception.
> Placing the close after saving the target document works fine. Second,
> splitting a document into a lot of small documents and then saving those
> documents from multiple threads causes random exceptions such as
> ArrayIndexOutOfBounds, COSStream closed, etc.
> See, for example, the following code. I attach the source document used.
> {code:title=Test.java|borderStyle=solid}
>     PDDocument doc = new PDDocument();
>     PDDocument load = PDDocument.load(new File(SOURCE_DOC));
>     for (int p = 0; p < 1000; p++) {
>         doc.importPage(load.getPage(0));
>     }
>     ByteArrayOutputStream baos = new ByteArrayOutputStream();
>     doc.save(baos);
>     doc.close();
>     load.close();
>     final PDDocument doc2 = PDDocument.load(baos.toByteArray());
>     // ok, now we have a big document loaded as it normally will be loaded.
>     ExecutorService es = Executors.newFixedThreadPool(4);
>     List<PDDocument> docs = Lists.newArrayList();
>     for (int p = 0; p < doc2.getNumberOfPages(); p++) {
>         final PDDocument newDoc = new PDDocument();
>         newDoc.importPage(doc2.getPage(p));
>         docs.add(newDoc);
>     }
>     for (int p = 0; p < doc2.getNumberOfPages(); p++) {
>         final int page = p;
>         es.submit(new Runnable() {
>             @Override
>             public void run() {
>                 try {
>                     PDDocument newDoc = docs.get(page);
>                     newDoc.save(new ByteArrayOutputStream());
>                     newDoc.close();
>                 } catch (IOException e) {
>                     e.printStackTrace();
>                 }
>             }
>         });
>     }
>     es.shutdown();
> }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
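
The first problem in the report (closing the source before saving the
target) can be illustrated without PDFBox. The sketch below is a simplified
model only: Stream, Document, importShallow and importDeep are hypothetical
names invented for this illustration and do not reflect how PDFBox
implements importPage. It only assumes that an imported page keeps a
reference to a stream owned by the source document.

```java
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class ShallowImportSketch {

    /** Hypothetical stand-in for a page's content stream. */
    static final class Stream {
        private final byte[] data;
        private boolean closed;

        Stream(byte[] data) { this.data = data; }

        byte[] read() throws IOException {
            if (closed) {
                throw new IOException("stream already closed");
            }
            return data;
        }

        void close() { closed = true; }

        /** Independent copy that does not share state with the original. */
        Stream deepCopy() { return new Stream(data.clone()); }
    }

    /** Hypothetical stand-in for a document that owns its page streams. */
    static final class Document implements Closeable {
        private final List<Stream> pages = new ArrayList<>();

        void addPage(byte[] content) { pages.add(new Stream(content)); }

        Stream getPage(int index) { return pages.get(index); }

        /** Shares the source's stream object (shallow import). */
        void importShallow(Stream page) { pages.add(page); }

        /** Copies the stream so the target owns its own data (deep import). */
        void importDeep(Stream page) { pages.add(page.deepCopy()); }

        void save(ByteArrayOutputStream out) throws IOException {
            for (Stream s : pages) {
                byte[] bytes = s.read();
                out.write(bytes, 0, bytes.length);
            }
        }

        @Override
        public void close() {
            // Closing a document closes every stream it references.
            for (Stream s : pages) {
                s.close();
            }
        }
    }

    /** Imports one page, closes the source first, then tries to save the target. */
    static boolean savesAfterSourceClosed(boolean deep) {
        Document source = new Document();
        source.addPage(new byte[] {1, 2, 3});
        Document target = new Document();
        if (deep) {
            target.importDeep(source.getPage(0));
        } else {
            target.importShallow(source.getPage(0));
        }
        source.close();
        try {
            target.save(new ByteArrayOutputStream());
            return true;
        } catch (IOException e) {
            return false;
        } finally {
            target.close();
        }
    }

    public static void main(String[] args) {
        System.out.println("shallow import saves: " + savesAfterSourceClosed(false));
        System.out.println("deep import saves: " + savesAfterSourceClosed(true));
    }
}
```

In this model the shallow import fails to save once the source is closed,
while the deep import succeeds. The deep copy duplicates the page data,
though, and the commit above reverted r1741294 because split ended up
creating huge files.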