[ 
https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Manes resolved PDFBOX-4396.
-------------------------------
    Resolution: Workaround

> Memory leak due to soft reference caching
> -----------------------------------------
>
>                 Key: PDFBOX-4396
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4396
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.12
>         Environment: JDK10; G1
>            Reporter: Ben Manes
>            Priority: Major
>         Attachments: #2 - memory leak 2.png, #2 - memory leak.png, memory 
> leak 2.png, memory leak.png
>
>
> In a heap dump, it appears that DefaultResourceCache is retaining 5.3 GB of 
> memory due to buffered images (via PDImageXObject). I suspect that G1 is not 
> collecting soft references across all regions before it fails with an 
> out-of-memory error.
> In PDFBOX-4389, I discovered very slow PDDocument#load times due to a JDK10 
> I/O bug. Previously I was loading the document to render each page, but this 
> took 1.5 minutes. To work around that bug I reused the document instance 
> across pages. This seems to have failed because the pages were cached and not 
> cleared by the GC.
> The DefaultResourceCache does not prune its cache entries when the soft 
> references are collected. Like WeakHashMap, it should use a ReferenceQueue, 
> poll it on every access, and prune accordingly.
> Thankfully PDDocument#setResourceCache exists. For now I am going to reset 
> the cache to a new instance after a page has been rendered. The old entries 
> should then be unreachable and get GC'd more aggressively. If that doesn't 
> work, I'll either replace the cache (e.g. with Caffeine) or disable it by 
> setting the instance to null.
> I think the desired fix is to prune the DefaultResourceCache and, ideally, 
> reconsider usage of soft references (as they tend to be poor in practice). 
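
The ReferenceQueue-based pruning suggested in the description could be sketched roughly as follows. This is only an illustration built on java.lang.ref, not PDFBox code: the class PruningSoftCache and its methods are hypothetical names, and nothing here reflects how DefaultResourceCache is actually implemented.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical soft-value cache that drops map entries once the GC clears their referents. */
class PruningSoftCache<K, V> {

    /** Soft reference that remembers its key so the stale map entry can be located. */
    private static final class Entry<K, V> extends SoftReference<V> {
        final K key;

        Entry(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, Entry<K, V>> map = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    public synchronized V get(K key) {
        prune();
        Entry<K, V> entry = map.get(key);
        return entry == null ? null : entry.get();
    }

    public synchronized void put(K key, V value) {
        prune();
        map.put(key, new Entry<>(key, value, queue));
    }

    public synchronized int size() {
        prune();
        return map.size();
    }

    /** Removes every entry whose soft reference the GC has cleared and enqueued. */
    private void prune() {
        Reference<? extends V> ref;
        while ((ref = queue.poll()) != null) {
            Entry<?, ?> stale = (Entry<?, ?>) ref;
            // Two-argument remove: do nothing if the key now maps to a newer entry.
            map.remove(stale.key, stale);
        }
    }
}
```

Every access polls the queue, so stale entries are dropped as a side effect of normal use, much as WeakHashMap does. Without that step the map keeps one cleared reference object per key indefinitely.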



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

