[ 
https://issues.apache.org/jira/browse/PDFBOX-5486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17576372#comment-17576372
 ] 

Andreas Lehmkühler commented on PDFBOX-5486:
--------------------------------------------

[~tilman] That is still correct. The original idea behind the on-demand parsing 
of glyph data was to minimize the time needed to load huge external fonts when 
only some of the glyphs are needed for rendering, see PDFBOX-2303.

PDFBox 2.0.x doesn't close TrueType fonts until the corresponding PDF document 
is closed or the finalizer closes them. Furthermore, some external fonts are 
cached, and PDFBox keeps them open until the JVM terminates. Some of the font 
data is copied to ScratchFileBuffers and, depending on the configuration, ends 
up in memory.

In the current trunk some of the caches have been removed and the font data is 
copied to memory before it is parsed. In many cases the parser's input data is 
closed after parsing so that the memory is released. Only the glyph data is 
cached in memory so that the on-demand creation of the glyphs still works.
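
To illustrate that idea, here is a minimal sketch. It is not the fontbox code: 
the class LazyGlyphTable, its methods and the 4-byte record layout are all 
made up for this example. It only shows the pattern: copy the glyph bytes to 
memory, close the input, and parse a glyph the first time it is requested.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch, NOT PDFBox/fontbox code: keeps a private in-memory copy
 * of the glyph bytes so glyphs can still be parsed after the source is closed.
 */
class LazyGlyphTable
{
    // Assumption for this sketch: every "glyph" is one 4-byte big-endian value.
    private static final int RECORD_SIZE = 4;

    private final byte[] glyphData;
    private final Map<Integer, Integer> parsed = new HashMap<>();
    private int parseCount = 0;

    LazyGlyphTable(InputStream source) throws IOException
    {
        // Copy everything to memory, then close the source right away.
        try (InputStream in = source)
        {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1)
            {
                out.write(chunk, 0, n);
            }
            glyphData = out.toByteArray();
        }
    }

    int getNumberOfGlyphs()
    {
        return glyphData.length / RECORD_SIZE;
    }

    /** Parses the glyph on first access and caches the result. */
    int getGlyph(int gid)
    {
        if (gid < 0 || gid >= getNumberOfGlyphs())
        {
            throw new IllegalArgumentException("no such glyph: " + gid);
        }
        Integer cached = parsed.get(gid);
        if (cached == null)
        {
            int o = gid * RECORD_SIZE;
            cached = ((glyphData[o] & 0xFF) << 24) | ((glyphData[o + 1] & 0xFF) << 16)
                    | ((glyphData[o + 2] & 0xFF) << 8) | (glyphData[o + 3] & 0xFF);
            parsed.put(gid, cached);
            parseCount++;
        }
        return cached;
    }

    /** Number of glyphs actually parsed so far (for demonstration only). */
    int getParseCount()
    {
        return parseCount;
    }
}
```

Because the table reads only from its own copy, a later getGlyph() call cannot 
run into an already closed source. The price is that the copied bytes stay in 
memory for as long as the table is alive.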

TrueType fonts keep the original font data if they are embedded, and they 
aren't closed until the corresponding PDF is closed. The font embedding code 
needs the original data. That is on my TODO list for 4.0.x or later as well. 

Some parts of the code have concurrent caching mechanisms, and those don't 
complement one another but may sometimes be counterproductive. At the very 
least, the code is hard to maintain.

I guess everybody knows my opinion about that ;-) Let us remove such 
constructs, simplify and see where it ends. If there really is a need for some 
caching, we might implement something new that better suits the structures we 
have. It is not new, and of course not my finding, that every now and then a 
refactoring is needed to break up old structures.


> "RandomAccessBuffer already closed" when opening smaller fonts
> --------------------------------------------------------------
>
>                 Key: PDFBOX-5486
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5486
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 3.0.0 PDFBox
>            Reporter: Tilman Hausherr
>            Assignee: Andreas Lehmkühler
>            Priority: Major
>             Fix For: 3.0.0 PDFBox
>
>
> I wonder if this is related to one of the memory management / input stream 
> changes: PDTrueTypeFont.load() can't load smaller TTF fonts (I discovered 
> this while working with the font from PDFBOX-5484):
> {code}
>     public static void main(String[] args) throws IOException
>     {
>         File fontDir = new File("C:/windows/fonts");
>         File[] files = fontDir.listFiles(
>                 (File dir, String name) -> name.toLowerCase().endsWith(".ttf"));
>         for (File file : files)
>         {
>             PDDocument doc = new PDDocument();
>             PDTrueTypeFont ttf = PDTrueTypeFont.load(doc, file, WinAnsiEncoding.INSTANCE);
>             if (ttf.hasGlyph("A"))
>             {
>                 try
>                 {
>                     ttf.getPath("A");
>                 }
>                 catch (IOException ex)
>                 {
>                     System.out.println("font " + ttf.getName() + " failed, size: " + file.length() +
>                             ", glyphs: " + ttf.getTrueTypeFont().getNumberOfGlyphs() + ": " + ex.getMessage());
>                     ex.printStackTrace();
>                 }
>             }
>         }
>     }
> {code}
> {noformat}
> font BookAntiqua-Bold failed, size: 151000, glyphs: 669: RandomAccessBuffer already closed
> java.io.IOException: RandomAccessBuffer already closed
>     at org.apache.pdfbox.io.RandomAccessReadBuffer.checkClosed(RandomAccessReadBuffer.java:337)
>     at org.apache.pdfbox.io.RandomAccessReadBuffer.getPosition(RandomAccessReadBuffer.java:188)
>     at org.apache.fontbox.ttf.RandomAccessReadDataStream.getCurrentPosition(RandomAccessReadDataStream.java:80)
>     at org.apache.fontbox.ttf.GlyphTable.getGlyph(GlyphTable.java:135)
>     at org.apache.pdfbox.pdmodel.font.PDTrueTypeFont.getPath(PDTrueTypeFont.java:498)
>  
> {noformat}
> It does not happen with larger fonts, e.g. Arial.



