Hi,

First, let me give credit to the developers of pdfbox for this very usable tool. Please continue your work, guys!

I have a web application storing lots of PDF documents in a database. For easier bulk download and printing, I am using pdfbox to merge multiple PDF documents into one large PDF document for download. The destination stream of the merge is the HTTP output stream, so the merged PDF data goes directly to the requesting web client.

Today I learned from a "too many open files" error that pdfbox creates a temporary file for each source input stream and keeps it open until the end of the merge process (I tried to merge 1025 PDF sources into one PDF on a Linux box). Is this behaviour necessary, perhaps because of the PDF format? In any case, I was able to work around it by increasing the open file limit of the user.
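
As an alternative to raising the limit, merging in batches might keep the number of simultaneously open temporary files bounded: merge at most N sources into an intermediate PDF, then merge the intermediates. The following is only a sketch of the batching step, under the assumption that each batch is merged and closed before the next one starts. It contains no pdfbox calls; the class name BatchPlanner, the method partition and the batch size of 500 are hypothetical and chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchPlanner {

    // Splits the sources into consecutive batches of at most maxOpen items,
    // so that no single merge pass has to hold more than maxOpen sources open.
    static <T> List<List<T>> partition(List<T> sources, int maxOpen) {
        if (maxOpen < 1) {
            throw new IllegalArgumentException("maxOpen must be >= 1");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < sources.size(); start += maxOpen) {
            int end = Math.min(start + maxOpen, sources.size());
            // Copy the sublist so each batch is independent of the input list.
            batches.add(new ArrayList<>(sources.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Stand-ins for the 1025 document ids from the database.
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 1025; i++) {
            ids.add(i);
        }
        List<List<Integer>> batches = partition(ids, 500);
        System.out.println(batches.size() + " batches, last one has "
                + batches.get(batches.size() - 1).size() + " sources");
    }
}
```

With 1025 sources and a batch size of 500 this yields three batches (500, 500 and 25 sources). The price is that the intermediate results have to be stored somewhere, so nothing could be streamed to the client before the final pass.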

When does pdfbox write the first bytes into the merge output stream? Does it happen during the merge process or only after the last source has been merged? In other words, does the requesting web client have to wait until all sources have been merged before the download starts?

Thanks for any information,
Joern
