Hi Maruan,
the temporary files are not a problem. I just wanted to know whether it is
necessary to keep them open until the merge is finished. Your answer
implies that it is, so be it.
Here is a distilled version of the code I am using:
    private static void downloadMergedPDF(HttpServletResponse response,
            List<InputStream> documentList, String fileName)
            throws IOException, COSVisitorException {
        response.setContentType("application/pdf");
        response.setContentLength(-1);
        response.addHeader("Content-disposition",
                "attachment; filename=" + fileName);
        OutputStream output = response.getOutputStream();
        PDFMergerUtility merger = new PDFMergerUtility();
        for (InputStream document : documentList) {
            merger.addSource(document);
        }
        merger.setDestinationStream(output);
        merger.mergeDocuments();
        output.flush();
        output.close();
    }
I would still like to know when the merger starts writing bytes to the output
stream: already during the merge, or only after the merge has finished? This
matters to me because I want to estimate how long the user has to wait for the
download to begin.
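One way to find this out empirically, independent of how the merger works internally, would be to wrap the destination stream in a small stream that notes when the first byte arrives. The following is only a sketch: FirstWriteTimingStream is a hypothetical helper, not part of pdfbox, and it uses nothing beyond java.io.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/**
 * Hypothetical helper (not part of pdfbox): forwards all bytes to the
 * wrapped stream and remembers when the first byte was written.
 */
class FirstWriteTimingStream extends FilterOutputStream {
    private final long createdNanos = System.nanoTime();
    private long firstWriteNanos = -1; // -1 means nothing has been written yet
    private long bytesWritten = 0;

    FirstWriteTimingStream(OutputStream target) {
        super(target);
    }

    @Override
    public void write(int b) throws IOException {
        markFirstWrite();
        out.write(b);
        bytesWritten++;
    }

    @Override
    public void write(byte[] buf, int off, int len) throws IOException {
        if (len > 0) {
            markFirstWrite();
        }
        // Forward in one call instead of the byte-by-byte default.
        out.write(buf, off, len);
        bytesWritten += len;
    }

    private void markFirstWrite() {
        if (firstWriteNanos < 0) {
            firstWriteNanos = System.nanoTime();
        }
    }

    /** True once at least one byte has been passed to the wrapped stream. */
    boolean hasWritten() {
        return firstWriteNanos >= 0;
    }

    /** Milliseconds between construction and the first write, or -1 if none. */
    long millisUntilFirstWrite() {
        return hasWritten() ? (firstWriteNanos - createdNanos) / 1_000_000L : -1;
    }

    long getBytesWritten() {
        return bytesWritten;
    }
}
```

In the method above, the wrapper would be created around response.getOutputStream() and passed to merger.setDestinationStream(...). Comparing millisUntilFirstWrite() with the total duration of mergeDocuments() then shows whether output starts early or only at the end. Note that the servlet container buffers the response as well, so the client may see the first byte somewhat later than this measurement suggests.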
Regards
Joern
On 07.12.2013 at 09:05, Maruan Sahyoun wrote:
Hi Joern,
you could do it completely in memory, but at the cost of higher memory
consumption, as all files have to be kept until the merge finishes. So from my
perspective, adjusting the open file limit is the better option.
Maybe you can post a code snippet showing how you load the files and do the
merging. There may be an easy way to improve that.
BR
Maruan Sahyoun
On 07.12.2013 at 01:01, Jörn Haferstroh <[email protected]> wrote:
Hi,
first, let me give some credit to the developers of pdfbox for this very useful
tool. Please continue your work, guys!
I have a web application storing lots of PDF documents in a database. For
easier bulk download and printing, I am using pdfbox to merge multiple PDF
documents into one large PDF document for download. The destination stream of
the merge is the HTTP output stream, so the merged PDF data goes directly to
the requesting web client.
Today I learned from a "too many open files" error that pdfbox creates a
temporary file for each source input stream and keeps it open until the end of the merge
process (I tried to merge 1025 PDF sources into one PDF on a Linux box). Is this
behaviour necessary, perhaps because of the PDF format? In any case, I was able to work
around it by increasing the user's open file limit.
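To notice such a situation before the merge fails, the application could compare the number of sources with the file descriptors the JVM still has available. This is a rough sketch only: FileDescriptorHeadroom and the reserve parameter are hypothetical names, it assumes one descriptor per source as observed above, and it relies on com.sun.management.UnixOperatingSystemMXBean, which is only available on Unix-like JVMs.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

/** Hypothetical helper (not part of pdfbox) for a pre-merge sanity check. */
class FileDescriptorHeadroom {

    /**
     * Descriptors the JVM process may still open, or -1 if the platform
     * does not expose this information.
     */
    static long remaining() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            com.sun.management.UnixOperatingSystemMXBean unix =
                    (com.sun.management.UnixOperatingSystemMXBean) os;
            return unix.getMaxFileDescriptorCount()
                    - unix.getOpenFileDescriptorCount();
        }
        return -1;
    }

    /**
     * Assumes one descriptor per source plus a safety reserve for sockets,
     * logs and so on. An unknown limit (negative value) is treated as "fits".
     */
    static boolean canOpen(int sourceCount, long remaining, int reserve) {
        return remaining < 0 || (long) sourceCount + reserve <= remaining;
    }
}
```

A caller could evaluate canOpen(documentList.size(), FileDescriptorHeadroom.remaining(), 32) before starting the merge and reject or split the request if it returns false. The value 32 is an arbitrary example reserve.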
When does pdfbox write the first bytes into the merge output stream? Does it
happen during the merge process or only after the last source has been merged?
In other words, does the requesting web client have to wait until all sources
have been merged before the download starts?
Thanks for any information
Joern