[ 
https://issues.apache.org/jira/browse/PDFBOX-4242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16521927#comment-16521927
 ] 

Emmeran Seehuber commented on PDFBOX-4242:
------------------------------------------

[~tilman] Yes, using a finalizer is a no-go, as it will likely never run. The 
way to go here is to use a PhantomReference and a ReferenceQueue to close the 
file handle, e.g.:

 
{code:java}
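import java.io.IOException;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.concurrent.ConcurrentLinkedDeque;

class TTFPhantomReference extends PhantomReference<TrueTypeFont> {
  // Only the resource to close is kept here, never the font itself.
  final TTFDataStream dataToClose;

  TTFPhantomReference(TrueTypeFont font) {
    super(font, TTFReferenceQueue.INSTANCE);
    // Assumes the font's data stream is accessible from this class.
    dataToClose = font.data;
    // Pin this reference, otherwise it will be GCed before it can do its magic...
    TTFReferenceQueue.referencePin.add(this);
  }
}

class TTFReferenceQueue extends ReferenceQueue<TrueTypeFont> {
  /* Pin list, so the references are not GCed before they could clean up the objects */
  static final ConcurrentLinkedDeque<TTFPhantomReference> referencePin =
      new ConcurrentLinkedDeque<TTFPhantomReference>();
  static final TTFReferenceQueue INSTANCE = new TTFReferenceQueue();

  /* The ugly part, we need a thread to poll the queue */
  static final Thread ttfPollThread = new Thread() {
    @Override
    public void run() {
      while (true) {
        try {
          // Block till we get a reference.
          Reference<? extends TrueTypeFont> ref = INSTANCE.remove();
          if (ref instanceof TTFPhantomReference) {
            referencePin.remove(ref);
            ((TTFPhantomReference) ref).dataToClose.close();
          }
        } catch (InterruptedException e) {
          return; // Interrupted: stop polling.
        } catch (IOException e) {
          // Nothing sensible to do if closing fails.
        }
      }
    }
  };

  static {
    ttfPollThread.setDaemon(true);
    ttfPollThread.start();
  }
}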
{code}
I have no idea whether the TrueTypeFont should be the object referenced by the 
PhantomReference, or whether it should be a parent object (e.g. PDFont). The 
PhantomReference just needs a reference to the resource that should be closed, 
but of course not a reference to the owner object. This avoids the finalizer 
resurrection problem, as the owner object will already be gone by the time the 
reference is processed.
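
To illustrate the point about keeping only the resource in the reference, here 
is a minimal, self-contained sketch. All names in it (TrackedStream, 
StreamCloserRef, PhantomCloseDemo) are made up for illustration and are not 
part of PDFBox or FontBox; it only assumes that the resource is a 
java.io.Closeable.

```java
import java.io.Closeable;
import java.io.IOException;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

/** Hypothetical stand-in for a font data stream; records whether it was closed. */
final class TrackedStream implements Closeable {
  volatile boolean closed;

  @Override
  public void close() {
    closed = true;
  }
}

/** Keeps only the stream, not the owner, so the owner can still become unreachable. */
final class StreamCloserRef extends PhantomReference<Object> {
  private final Closeable stream;

  StreamCloserRef(Object owner, Closeable stream, ReferenceQueue<Object> queue) {
    super(owner, queue);
    this.stream = stream;
  }

  void closeStream() {
    try {
      stream.close();
    } catch (IOException e) {
      // Best effort only in this sketch.
    }
  }
}

final class PhantomCloseDemo {
  /** Waits up to timeoutMillis (must be > 0) for one enqueued reference and closes its stream. */
  static boolean closeNext(ReferenceQueue<Object> queue, long timeoutMillis) {
    try {
      Reference<?> ref = queue.remove(timeoutMillis);
      if (ref instanceof StreamCloserRef) {
        ((StreamCloserRef) ref).closeStream();
        return true;
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return false;
  }
}
```

If StreamCloserRef also stored the owner in a field, the owner would stay 
strongly reachable through the pinned reference and would never be enqueued.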

The problem with this code is that it creates a static thread, which in turn 
may lead to a class loader leak. 
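
One possible way to reduce that risk, sketched here under the assumption that 
the hosting application gets a chance to call a shutdown hook, is to make the 
poll thread stoppable. The class StoppableQueuePoller and its methods are 
hypothetical and only show the idea:

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

/** Hypothetical variant of the poll thread that the hosting application can shut down. */
final class StoppableQueuePoller {
  private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
  private final Thread thread;

  StoppableQueuePoller() {
    thread = new Thread(this::pollLoop, "ttf-reference-poller");
    thread.setDaemon(true);
    thread.start();
  }

  /** The queue that phantom references should be registered with. */
  ReferenceQueue<Object> queue() {
    return queue;
  }

  private void pollLoop() {
    try {
      while (!Thread.currentThread().isInterrupted()) {
        // Blocks until a reference is enqueued or the thread is interrupted.
        Reference<?> ref = queue.remove();
        // A real implementation would close the resource attached to ref here.
      }
    } catch (InterruptedException e) {
      // Interrupted by stop(): leave the loop so the thread can terminate.
    }
  }

  /** Interrupts the poller and waits for it to finish; returns true once it has stopped. */
  boolean stop(long timeoutMillis) {
    thread.interrupt();
    try {
      thread.join(timeoutMillis);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return !thread.isAlive();
  }
}
```

In a servlet container, stop() could for example be called from a 
ServletContextListener's contextDestroyed() method. This only helps if 
somebody actually calls it, which is exactly the part that is hard to 
guarantee from inside a library.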

Guava does some special magic to avoid this problem and to work correctly in 
container environments. If you could use Guava, the code would look like 
this:
{code:java}
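import com.google.common.base.FinalizablePhantomReference;
import com.google.common.base.FinalizableReferenceQueue;
import java.io.IOException;
import java.util.concurrent.ConcurrentLinkedDeque;

public class TTFFileCloser {
  private static final FinalizableReferenceQueue queue = new FinalizableReferenceQueue();
  // Static, so the pinned references do not go away together with a TTFFileCloser instance.
  private static final ConcurrentLinkedDeque<FinalizablePhantomReference<TrueTypeFont>>
      phantomReferencePinner = new ConcurrentLinkedDeque<>();

  public void closeFileIfNotReachable(TrueTypeFont owner, final TTFDataStream data) {
    FinalizablePhantomReference<TrueTypeFont> finalizablePhantomReference =
        new FinalizablePhantomReference<TrueTypeFont>(owner, queue) {
          @Override
          public void finalizeReferent() {
            phantomReferencePinner.remove(this);
            try {
              data.close();
            } catch (IOException e) {
              // Nothing sensible to do if closing fails.
            }
          }
        };
    // Pin the reference to avoid it being GCed
    phantomReferencePinner.add(finalizablePhantomReference);
  }
}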
{code}
PhantomReferences are queued very quickly after the owner object is no longer 
reachable. I use this (with the Guava base classes) very successfully to clean 
up all my temp files. As you can't use Guava, I don't know if you want to go 
this route, as getting this right in a servlet context is not that easy... 
Note: you don't strictly need a background thread to do the cleanup. Guava has 
a fallback in which it cleans up all queued references whenever a new 
reference is added to the list, but it only uses this fallback if it can't 
create a background thread.
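
The thread-free fallback idea can be sketched in plain Java as follows. This 
is not Guava's actual implementation, just a minimal illustration of "clean up 
whatever is queued each time something new is registered"; the class 
LazyResourceCloser and its methods are hypothetical.

```java
import java.io.Closeable;
import java.io.IOException;
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical thread-free closer: stale resources are closed when a new one is registered. */
final class LazyResourceCloser {
  private static final class Entry extends PhantomReference<Object> {
    final Closeable resource;

    Entry(Object owner, Closeable resource, ReferenceQueue<Object> queue) {
      super(owner, queue);
      this.resource = resource;
    }
  }

  private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
  // Pins the entries so they are not collected before they are enqueued.
  private final Set<Entry> pinned = ConcurrentHashMap.newKeySet();

  /** Cleans up everything already enqueued, then registers the new resource. */
  Reference<Object> register(Object owner, Closeable resource) {
    drain();
    Entry entry = new Entry(owner, resource, queue);
    pinned.add(entry);
    return entry;
  }

  /** Closes the resources of all entries enqueued so far and returns how many there were. */
  int drain() {
    int processed = 0;
    Reference<?> ref;
    while ((ref = queue.poll()) != null) { // poll() never blocks
      Entry entry = (Entry) ref;
      pinned.remove(entry);
      try {
        entry.resource.close();
      } catch (IOException e) {
        // Best effort only in this sketch.
      }
      processed++;
    }
    return processed;
  }

  /** Number of resources that are still waiting for their owner to be collected. */
  int pendingCount() {
    return pinned.size();
  }
}
```

The trade-off is that nothing is closed while no new resources are 
registered, so the last few file handles may stay open until the application 
exits or drain() is called explicitly.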

> Fontbox does not close file descriptor when loading fonts.
> ----------------------------------------------------------
>
>                 Key: PDFBOX-4242
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4242
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 2.0.9
>            Reporter: Glen Peterson
>            Priority: Minor
>              Labels: file_leak
>
> My app has been getting "java.io.FileNotFoundException (No file descriptors 
> available)" and I've confirmed that it's because fontbox isn't closing its 
> file descriptors.
> In org.apache.fontbox.ttf.TTFParser there's this method:
> {{public TrueTypeFont parse(File ttfFile) throws IOException {}}
>  {{  RAFDataStream raf = new RAFDataStream(ttfFile, "r");}}
> {{  try {}}
>  {{    return this.parse((TTFDataStream)raf);}}
>  {{  } catch (IOException var4) {}}
>  {{    // close only on error (file is still being accessed later)}}
>  {{    raf.close();}}
>  {{    throw var4;}}
>  {{}}}
>  {{}}}
> I would have expected to see the close() in a finally block so that the file 
> is always closed, not just on exceptions. Presumably, you can keep it in 
> memory without leaving the file descriptor open?
> {{public TrueTypeFont parse(File ttfFile) throws IOException {}}
>  {{  RAFDataStream raf = new RAFDataStream(ttfFile, "r");}}
> {{  try {}}
>  {{    return this.parse((TTFDataStream)raf);}}
>  {{  } catch (IOException var4) {}}
>  {{    raf.close();}}
>  {{    throw var4;}}
>  {{  } finally {}}
>  {{    raf.close();}}
>  {{}}}
>  {{}}}
> I tried performing this in a lazy initialization, but it blew up:
> java.lang.RuntimeException: java.io.IOException: The TrueType font null does not contain a 'cmap' table
> Caused by: java.io.IOException: The TrueType font null does not contain a 'cmap' table
>   at org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmapImpl(TrueTypeFont.java:548)
>   at org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmapLookup(TrueTypeFont.java:528)
>   at org.apache.fontbox.ttf.TrueTypeFont.getUnicodeCmapLookup(TrueTypeFont.java:514)
>   at org.apache.fontbox.ttf.TTFSubsetter.<init>(TTFSubsetter.java:91)
>   at org.apache.pdfbox.pdmodel.font.TrueTypeEmbedder.subset(TrueTypeEmbedder.java:321)
>   at org.apache.pdfbox.pdmodel.font.PDType0Font.subset(PDType0Font.java:239)
>   at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1271)
> Thoughts?
> Thanks for PDFBox - it's been really helpful!
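
Regarding the question in the issue description about keeping the font in 
memory without leaving the file descriptor open: a minimal sketch of that 
idea, assuming the whole file fits into a byte array (smaller than 2 GB) and 
that whatever parses the font afterwards can work from an in-memory stream. 
The class InMemoryFontSource is hypothetical and not part of FontBox.

```java
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;

/** Hypothetical helper that trades memory for not holding a file descriptor. */
final class InMemoryFontSource {
  /** Reads the whole file; the descriptor is closed before returning, even on failure. */
  static byte[] readFully(File file) throws IOException {
    try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
      byte[] bytes = new byte[(int) raf.length()]; // assumes the file is smaller than 2 GB
      raf.readFully(bytes);
      return bytes;
    }
  }

  /** Returns a stream over the in-memory copy; no file descriptor stays open. */
  static InputStream open(File file) throws IOException {
    return new ByteArrayInputStream(readFully(file));
  }
}
```

The cost is that the complete font file stays on the heap for as long as the 
font is in use, which can matter for large fonts.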



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
