[
https://issues.apache.org/jira/browse/PDFBOX-5530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17745951#comment-17745951
]
Andreas Lehmkühler commented on PDFBOX-5530:
--------------------------------------------
The problem with the 2.0.x implementation is that all buffers share the same
page size, so choosing a value is always a compromise. Small streams waste a
lot of memory if the page size is much bigger than the stream size. If there
are lots of small streams, as in your PDF, the wasted memory adds up to a huge
amount. That said, it isn't sufficient to count the streams; you also have to
look at their (estimated) sizes to determine whether reducing the page size
has any chance of reducing the memory footprint.
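
To make the trade-off concrete, here is a rough sketch of the arithmetic. It
does not use PDFBox at all; the class and method names are hypothetical, and
the numbers (100,000 streams of 200 bytes each, page sizes of 4096 and 512
bytes) are illustrative assumptions, not values measured from the attached
PDF. It also assumes each stream gets its own pages and that an empty stream
occupies none, which is a simplification.

```java
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox: estimates the memory lost to
// padding when every stream is stored in whole pages of one fixed size.
class PageWaste {

    // Bytes reserved for a stream of streamSize bytes, rounded up to
    // whole pages. Assumption: an empty stream reserves no page.
    static long allocatedBytes(long streamSize, long pageSize) {
        if (streamSize <= 0) {
            return 0;
        }
        long pages = (streamSize + pageSize - 1) / pageSize; // ceiling division
        return pages * pageSize;
    }

    // Total padding (reserved minus used bytes) over all streams.
    static long wastedBytes(long[] streamSizes, long pageSize) {
        long wasted = 0;
        for (long size : streamSizes) {
            wasted += allocatedBytes(size, pageSize) - size;
        }
        return wasted;
    }

    public static void main(String[] args) {
        // Illustrative input: 100,000 small streams of 200 bytes each.
        long[] sizes = new long[100_000];
        Arrays.fill(sizes, 200L);

        System.out.println(wastedBytes(sizes, 4096)); // prints 389600000
        System.out.println(wastedBytes(sizes, 512));  // prints 31200000
    }
}
```

With these made-up numbers, a smaller page size cuts the padding by more than
a factor of ten. If the streams were close to the page size, or much larger,
the gain would be small, which is why the stream sizes matter and not just the
stream count.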
> Java heap space
> ---------------
>
> Key: PDFBOX-5530
> URL: https://issues.apache.org/jira/browse/PDFBOX-5530
> Project: PDFBox
> Issue Type: Bug
> Affects Versions: 2.0.25
> Reporter: liu
> Priority: Blocker
> Attachments: image-2022-10-20-14-30-19-790.png,
> image-2022-10-20-14-30-57-332.png, image-2022-10-20-14-32-10-258.png,
> image-2022-10-20-15-01-06-688.png, image-2022-10-20-19-07-42-632.png,
> image-2022-10-20-19-08-23-932.png, screenshot-1.png, screenshot-2.png,
> screenshot-3.png, screenshot-4.png, 引起宕机-1.pdf, 引起宕机.pdf
>
>
> code (only this part of the code):
> PDDocument load = PDDocument.load(file,
> MemoryUsageSetting.setupTempFileOnly(-1));
>
> Hi. Why does it still take up so much memory when I configure it like this?
> What is the effect of using setupTempFileOnly?
> !image-2022-10-20-14-30-19-790.png!
> !image-2022-10-20-14-30-57-332.png!
> !image-2022-10-20-14-32-10-258.png!
> [^引起宕机.pdf]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)