Hi,
On 04.06.24 at 10:44, Constantine Dokolas wrote:
Hi all!
I have a requirement for PDFBox memory management where a multi-threaded
process that is generating PDF files (one per thread, at most) should share
a certain total amount of RAM (any excess should use scratch files). This
is because PDFs are in the order of thousands of pages and in-memory
resources must come from a limited, common pool for efficient use of heap
memory.
That seems to be a rare combination: multiple threads and the creation of
huge PDFs. However, I'd like to share some thoughts.
Is there any mechanism available for this purpose? MemoryUsageSetting
appears to control each PDF separately, but I need more flexibility, i.e.
some sort of pooling of RAM/file resources.
I guess this can't be done with 2.0.x, as the cache management is
hidden under the hood, so the configuration is limited to the
variations implemented in MemoryUsageSetting, as you already said.
I've looked a little into the improvements in 3.0 regarding the "stream
cache" and it could be a solution, albeit with some extra work.
In 3.0 the caching was overhauled. It is now limited to write operations,
which fits your use case perfectly. More importantly, the user is
able to control how the stream cache is used.
In your case it should be possible to create exactly one instance of
org.apache.pdfbox.io.ScratchFile using the desired configuration and
re-use it for all PDFs you are creating. The class should be
thread-safe. You might implement your own StreamCache if you need
something more sophisticated.
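
To illustrate what "more sophisticated" could mean, here is a minimal sketch of the bookkeeping a custom cache might use to share one heap budget across all threads. SharedMemoryBudget and its methods are hypothetical names of mine, not part of PDFBox, and the wiring into the 3.0 stream cache is deliberately left out.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (NOT a PDFBox class): a byte budget shared by all
 * PDF-writing threads. A custom cache could ask it for memory first and
 * fall back to a scratch file whenever the reservation is refused.
 */
final class SharedMemoryBudget {
    private final long capacityBytes;
    private final AtomicLong usedBytes = new AtomicLong();

    SharedMemoryBudget(long capacityBytes) {
        if (capacityBytes < 0) {
            throw new IllegalArgumentException("capacity must not be negative");
        }
        this.capacityBytes = capacityBytes;
    }

    /** Claims the given number of bytes; returns false if the pool is exhausted. */
    boolean tryReserve(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must not be negative");
        }
        while (true) {
            long used = usedBytes.get();
            if (bytes > capacityBytes - used) {
                return false; // caller should spill to a scratch file instead
            }
            if (usedBytes.compareAndSet(used, used + bytes)) {
                return true;
            }
        }
    }

    /** Returns previously reserved bytes to the pool. */
    void release(long bytes) {
        if (bytes < 0) {
            throw new IllegalArgumentException("bytes must not be negative");
        }
        while (true) {
            long used = usedBytes.get();
            if (bytes > used) {
                throw new IllegalStateException("released more than was reserved");
            }
            if (usedBytes.compareAndSet(used, used - bytes)) {
                return;
            }
        }
    }

    /** Bytes that can still be reserved at this moment. */
    long availableBytes() {
        return capacityBytes - usedBytes.get();
    }
}
```

Each thread would call tryReserve before keeping a buffer in memory and release once the buffer is freed. The compare-and-set loop avoids a lock, so threads never block each other just to do the accounting.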
Andreas
Any ideas?
C.D.