[
https://issues.apache.org/jira/browse/PDFBOX-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17207856#comment-17207856
]
Maison commented on PDFBOX-4963:
--------------------------------
Yes, the font cache holding Closeable objects in SoftReferences is a Java
challenge... Unless font caching is refactored, I am afraid that using a
reference queue is the only way to go.
Note that when the referent becomes unreachable and the reference is cleared,
the reference is somehow "guaranteed" to be added to the reference queue: this
occurs before finalization (which can be delayed).
Now, emptying this queue can be performed by the thread that calls getFont() or
addFont(): this is exactly what the Guava cache does when SoftReferences are
used. SoftReferences are created with a queue 'segment.valueReferenceQueue' here:
[https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/LocalCache.java#L400]
and this queue is processed in drainValueReferenceQueue():
[https://github.com/google/guava/blob/master/guava/src/com/google/common/cache/LocalCache.java#L2468]
The processing is performed at several points: on get(), if possible, and IIUC
on each write [see line 3427: preWriteCleanup() calls runLockedCleanup()].
I couldn't find a way to use this ref queue pattern without a getter
TrueTypeFont.getRAFStream(), which indeed slightly pollutes the API.
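
To make the pattern concrete, here is a minimal sketch of a soft-value cache
that drains its reference queue on each get() and put(). It is not the PDFBox
FontCache: the class ClosingSoftCache, its Entry type and the simulateClear()
hook are all hypothetical names, and the Closeable passed to put() stands in
for whatever a getter such as getRAFStream() would return. simulateClear()
exists only because GC timing cannot be forced in a demo.

```java
import java.io.Closeable;
import java.io.IOException;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a soft-value cache that closes resources; not PDFBox code. */
final class ClosingSoftCache<K, V> {

    /** Soft reference to the value, plus a strong handle on what must be closed. */
    private static final class Entry<K, V> extends SoftReference<V> {
        final K key;
        final Closeable resource;

        Entry(K key, V value, Closeable resource, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
            this.resource = resource;
        }
    }

    private final ReferenceQueue<V> queue = new ReferenceQueue<>();
    private final Map<K, Entry<K, V>> map = new ConcurrentHashMap<>();
    // A Reference that becomes unreachable itself is never enqueued, so
    // entries replaced in the map stay reachable here until they are drained.
    private final Set<Entry<K, V>> tracked = ConcurrentHashMap.newKeySet();

    V get(K key) {
        drainQueue();
        Entry<K, V> entry = map.get(key);
        return entry == null ? null : entry.get();
    }

    void put(K key, V value, Closeable resource) {
        drainQueue();
        Entry<K, V> entry = new Entry<>(key, value, resource, queue);
        tracked.add(entry);
        map.put(key, entry);
    }

    /** Closes the resource of every entry whose value has been cleared. */
    private void drainQueue() {
        Reference<? extends V> ref;
        while ((ref = queue.poll()) != null) {
            @SuppressWarnings("unchecked")
            Entry<K, V> entry = (Entry<K, V>) ref;
            tracked.remove(entry);
            map.remove(entry.key, entry); // no-op if the key was already replaced
            try {
                entry.resource.close();
            } catch (IOException e) {
                // Ignored in this sketch; real code would probably log it.
            }
        }
    }

    /** Demo hook only: enqueues the entry as the GC would after clearing it. */
    boolean simulateClear(K key) {
        Entry<K, V> entry = map.get(key);
        return entry != null && entry.enqueue();
    }
}
```

The sketch assumes the Closeable does not hold a strong reference back to the
cached value; otherwise the value would never become softly reachable.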
> TTF file leakage in font cache
> ------------------------------
>
> Key: PDFBOX-4963
> URL: https://issues.apache.org/jira/browse/PDFBOX-4963
> Project: PDFBox
> Issue Type: Bug
> Components: PDModel
> Affects Versions: 2.0.21
> Reporter: Maison
> Priority: Major
>
> We observe many open TTF files on our production server, which results in
> exhausting file descriptors.
> We have checked and rechecked that every PDDocument is properly closed
> (try-with-resources everywhere).
> By looking at pdfbox source code, I suspect 2 problems in FontCache and in
> FileSystemFontProvider
> 1 - FontCache
> In FontCache, a map keeps SoftReference<FontBoxFont> as values.
> IIUC for TTF fonts, the values are instances of
> org.apache.fontbox.ttf.TrueTypeFont. Such instances have a TTFDataStream
> member, which is RAFDataStream (so there is an opened file).
> The problem is that if the soft reference is cleared by GC, we can suppose the
> TrueTypeFont objects are GCed (is that guaranteed?); but what about the
> RAFDataStream sub-object? There is no RAFDataStream.close() in the
> TrueTypeFont finalizer.
> 2 - FileSystemFontProvider
> There seems to be a TOCTOU-like race condition when a font is needed. Code
> looks like below (simplified) :
> @Override
> public FontBoxFont getFont()
> {
>     FontBoxFont cached = parent.cache.getFont(this);
>     if (cached != null) {
>         return cached;
>     }
>     FontBoxFont font = ... // instantiate font
>     parent.cache.addFont(this, font); // <--- not thread safe ?
>     return font;
> }
> The font, if not in the cache, is instantiated and added to the cache. But two
> threads can do that at the same time, and the last addFont() wins. So the
> first SoftReference<FontBoxFont> object is now eligible for GC, but the
> FontBoxFont has not been closed.
> This problem is probably less frequent.
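
For the race described in point 2 of the quoted issue, one possible approach
is to make the lookup and the insertion a single atomic step, so that two
threads cannot each instantiate a font for the same key. The sketch below uses
ConcurrentHashMap.compute() for that. AtomicSoftCache and getOrLoad() are
hypothetical names, not PDFBox API, and the loader is assumed not to touch the
same map.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical sketch of an atomic check-then-load cache; not PDFBox code. */
final class AtomicSoftCache<K, V> {

    private final ConcurrentHashMap<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    V getOrLoad(K key, Supplier<V> loader) {
        // Strong holder, so the value cannot be cleared between compute() and return.
        Object[] holder = new Object[1];
        // compute() runs atomically per key: only one thread loads a given font.
        map.compute(key, (k, ref) -> {
            V cached = (ref == null) ? null : ref.get();
            if (cached != null) {
                holder[0] = cached;
                return ref; // keep the existing soft reference
            }
            V loaded = loader.get();
            holder[0] = loaded;
            return new SoftReference<>(loaded);
        });
        @SuppressWarnings("unchecked")
        V result = (V) holder[0];
        return result;
    }
}
```

This only removes the duplicate instantiation; a value cleared by the GC still
needs something like the reference queue to close its file.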