[ 
https://issues.apache.org/jira/browse/PDFBOX-3082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14992382#comment-14992382
 ] 

John Hewson commented on PDFBOX-3082:
-------------------------------------

This isn't a font cache issue. It's been a problem ever since we started 
scanning the local system for fonts in 2.0. It's become worse now that we can 
load TTC files, as those tend to be especially large.

The problem is that FontBox's parsers are greedy with memory, especially the 
CFF parser. Most of the parsing is done up-front rather than "on demand", 
large amounts of data are copied into auxiliary data structures, and the 
entire file is also cached in memory. So heap usage can be very high when 
parsing large fonts from disk.
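
To illustrate the difference between up-front and on-demand parsing, here is 
a minimal sketch. It is not FontBox code: LazyIndexSketch, EagerIndex and 
LazyIndex are hypothetical names, and the byte array stands in for a font 
file holding a table of variable-length entries. The eager variant copies 
every entry when the index is built, while the lazy variant keeps only the 
offsets and hands out a view when an entry is requested.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical illustration only; none of these classes exist in FontBox. */
public class LazyIndexSketch {

    /** Up-front: every entry is copied into its own array when the index is built. */
    public static final class EagerIndex {
        private final byte[][] entries;

        public EagerIndex(byte[] file, int[] offsets) {
            entries = new byte[offsets.length - 1][];
            for (int i = 0; i < entries.length; i++) {
                entries[i] = Arrays.copyOfRange(file, offsets[i], offsets[i + 1]);
            }
        }

        public byte[] get(int i) {
            return entries[i];
        }

        /** Total bytes duplicated on the heap in addition to the file itself. */
        public int copiedBytes() {
            int total = 0;
            for (byte[] entry : entries) {
                total += entry.length;
            }
            return total;
        }
    }

    /** On demand: only the offsets are kept; an entry is a view created when asked for. */
    public static final class LazyIndex {
        private final ByteBuffer file;
        private final int[] offsets;

        public LazyIndex(byte[] file, int[] offsets) {
            this.file = ByteBuffer.wrap(file).asReadOnlyBuffer();
            this.offsets = offsets.clone();
        }

        /** Returns a read-only view of entry i without copying its bytes. */
        public ByteBuffer get(int i) {
            ByteBuffer view = file.duplicate();
            view.limit(offsets[i + 1]);
            view.position(offsets[i]);
            return view.slice();
        }
    }

    public static void main(String[] args) {
        // Made-up data: three entries of 6, 7 and 8 bytes.
        byte[] file = "glyph0glyph01glyph012".getBytes(StandardCharsets.US_ASCII);
        int[] offsets = {0, 6, 13, 21};

        EagerIndex eager = new EagerIndex(file, offsets);
        LazyIndex lazy = new LazyIndex(file, offsets);

        System.out.println("bytes copied up-front: " + eager.copiedBytes());
        System.out.println("bytes in lazily viewed entry 1: " + lazy.get(1).remaining());
    }
}
```

With the eager index the extra heap grows with the size of the whole table, 
whether or not any entry is ever read. With the lazy index it grows only 
with the number of offsets.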

> High memory consumption while building font cache
> -------------------------------------------------
>
>                 Key: PDFBOX-3082
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3082
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 2.0.0
>            Reporter: Maruan Sahyoun
>            Priority: Blocker
>             Fix For: 2.0.0
>
>         Attachments: heap-usage.png
>
>
> When the font cache is built, memory consumption is very high.
> For this small program
> {code}
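> import java.io.IOException;
> import org.apache.pdfbox.pdmodel.font.PDFont;
> import org.apache.pdfbox.pdmodel.font.PDType1Font;
>
> public class PDFontTest
> {
>     public static void main(String[] args)
>     {
>         // Referencing PDType1Font runs its static initializer, which triggers the font scan.
>         PDFont font = PDType1Font.HELVETICA;
>         try
>         {
>             // Keep the JVM alive until input arrives (e.g. to inspect the heap).
>             System.in.read();
>         }
>         catch (IOException e)
>         {
>             e.printStackTrace();
>         }
>     }
> }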
> {code}
> I need to set {{-Xmx1512M}} in order to avoid an OutOfMemoryError.
> Smaller memory settings produce this:
> {code}
> $ java -Xmx1256M -jar Test.jar 
> Nov 03, 2015 2:48:32 AM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider loadCache
> WARNING: New fonts found, font cache will be re-built
> Nov 03, 2015 2:48:32 AM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider <init>
> WARNING: Building font cache, this may take a while
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>       at org.apache.fontbox.cff.IndexData.initData(IndexData.java:95)
>       at org.apache.fontbox.cff.CFFParser.readIndexData(CFFParser.java:163)
>       at org.apache.fontbox.cff.CFFParser.parseFont(CFFParser.java:393)
>       at org.apache.fontbox.cff.CFFParser.parse(CFFParser.java:115)
>       at org.apache.fontbox.ttf.CFFTable.read(CFFTable.java:53)
>       at org.apache.fontbox.ttf.TrueTypeFont.readTable(TrueTypeFont.java:377)
>       at org.apache.fontbox.ttf.OpenTypeFont.getCFF(OpenTypeFont.java:61)
>       at org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.addTrueTypeFontImpl(FileSystemFontProvider.java:432)
>       at org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.addTrueTypeCollection(FileSystemFontProvider.java:344)
>       at org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.scanFonts(FileSystemFontProvider.java:243)
>       at org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.<init>(FileSystemFontProvider.java:224)
>       at org.apache.pdfbox.pdmodel.font.FontMapperImpl$DefaultFontProvider.<clinit>(FontMapperImpl.java:132)
>       at org.apache.pdfbox.pdmodel.font.FontMapperImpl.getProvider(FontMapperImpl.java:151)
>       at org.apache.pdfbox.pdmodel.font.FontMapperImpl.findFont(FontMapperImpl.java:415)
>       at org.apache.pdfbox.pdmodel.font.FontMapperImpl.findFontBoxFont(FontMapperImpl.java:378)
>       at org.apache.pdfbox.pdmodel.font.FontMapperImpl.getFontBoxFont(FontMapperImpl.java:352)
>       at org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:114)
>       at org.apache.pdfbox.pdmodel.font.PDType1Font.<clinit>(PDType1Font.java:76)
>       at PDFontTest.main(PDFontTest.java:11)
> {code}
> A possible cause is the large number of fonts on my system:
> {code}
> Nov 03, 2015 2:56:01 AM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider loadCache
> WARNING: New fonts found, font cache will be re-built
> Nov 03, 2015 2:56:01 AM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider <init>
> WARNING: Building font cache, this may take a while
> Nov 03, 2015 2:56:22 AM org.apache.pdfbox.pdmodel.font.FileSystemFontProvider saveCache
> WARNING: Finished building font cache, found 876 fonts
> {code}


