[ https://issues.apache.org/jira/browse/PDFBOX-5727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tilman Hausherr updated PDFBOX-5727:
------------------------------------
Description:
Kjetil Ødegaard reported on the users mailing list that PDFBox is slow to
start because of font collection, and that two .ttc fonts that fail to parse
are retried on every start. He produced a stack trace of the exception, which
is caught:
{noformat}
java.io.EOFException
    at org.apache.fontbox.ttf.TTFDataStream.readUnsignedShort(TTFDataStream.java:154)
    at org.apache.fontbox.ttf.TTFDataStream.readUnsignedShortArray(TTFDataStream.java:188)
    at org.apache.fontbox.ttf.GlyphSubstitutionTable.readMultipleSubstitutionSubtable(GlyphSubstitutionTable.java:412)
    at org.apache.fontbox.ttf.GlyphSubstitutionTable.readLookupSubtable(GlyphSubstitutionTable.java:263)
    at org.apache.fontbox.ttf.GlyphSubstitutionTable.readLookupTable(GlyphSubstitutionTable.java:313)
    at org.apache.fontbox.ttf.GlyphSubstitutionTable.readLookupList(GlyphSubstitutionTable.java:247)
    at org.apache.fontbox.ttf.GlyphSubstitutionTable.read(GlyphSubstitutionTable.java:102)
    at org.apache.fontbox.ttf.TrueTypeFont.readTable(TrueTypeFont.java:365)
    at org.apache.fontbox.ttf.TTFParser.parseTables(TTFParser.java:165)
    at org.apache.fontbox.ttf.TTFParser.parse(TTFParser.java:144)
    at org.apache.fontbox.ttf.TrueTypeCollection.getFontAtIndex(TrueTypeCollection.java:127)
    at org.apache.fontbox.ttf.TrueTypeCollection.processAllFonts(TrueTypeCollection.java:109)
    at org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.addTrueTypeCollection(FileSystemFontProvider.java:665)
    at org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.scanFonts(FileSystemFontProvider.java:396)
    at org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.<init>(FileSystemFontProvider.java:367)
    at org.apache.pdfbox.pdmodel.font.FontMapperImpl$DefaultFontProvider.<clinit>(FontMapperImpl.java:139)
    at org.apache.pdfbox.pdmodel.font.FontMapperImpl.getProvider(FontMapperImpl.java:158)
    at org.apache.pdfbox.pdmodel.font.FontMapperImpl.findFont(FontMapperImpl.java:416)
    at org.apache.pdfbox.pdmodel.font.FontMapperImpl.findFontBoxFont(FontMapperImpl.java:379)
    at org.apache.pdfbox.pdmodel.font.FontMapperImpl.getFontBoxFont(FontMapperImpl.java:353)
    at org.apache.pdfbox.pdmodel.font.PDType1Font.<init>(PDType1Font.java:127)
{noformat}
My theory is that addTrueTypeCollection() does not add the failing font to the
cache file as "*skipexception*", because that is not done in the exception
handler, so the font is parsed again on every start.
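
To illustrate the theory, here is a minimal sketch of the idea, not the actual PDFBox code. The class SkipFailedFontSketch, its FontParser interface and the in-memory map standing in for the cache file are all hypothetical; only the "*skipexception*" marker comes from this issue. The point is that the failure is recorded inside the catch block, so a second scan does not parse the broken file again.

```java
import java.io.EOFException;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; none of these types are part of PDFBox.
final class SkipFailedFontSketch {

    /** Hypothetical stand-in for whatever parses a font file. */
    interface FontParser {
        void parse(String path) throws IOException;
    }

    // In-memory stand-in for the on-disk cache file: path -> status.
    private final Map<String, String> cache = new LinkedHashMap<>();
    private int parseAttempts = 0;

    void scan(String path, FontParser parser) {
        if (cache.containsKey(path)) {
            return; // already recorded (good or bad), so do not parse again
        }
        parseAttempts++;
        try {
            parser.parse(path);
            cache.put(path, "ok");
        } catch (IOException e) {
            // Record the failure in the handler itself, so the entry exists
            // even though parsing never completed.
            cache.put(path, "*skipexception*");
        }
    }

    int parseAttempts() {
        return parseAttempts;
    }

    String status(String path) {
        return cache.get(path);
    }

    public static void main(String[] args) {
        SkipFailedFontSketch sketch = new SkipFailedFontSketch();
        FontParser broken = path -> { throw new EOFException(); };
        sketch.scan("broken.ttc", broken);
        sketch.scan("broken.ttc", broken); // skipped: entry already cached
        System.out.println(sketch.parseAttempts()); // prints 1
    }
}
```

Without the put() in the catch block, the second scan would find no entry and parse the file again, which matches the reported behaviour.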
Gili Tzabari suggested using CRC32 instead of SHA512.
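
For comparison, a small sketch of both checksums using only the JDK (java.util.zip.CRC32 and java.security.MessageDigest). It assumes the checksum is computed over the bytes of a font file to detect changes; the issue text does not say what is hashed, and the class and method names here are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.zip.CRC32;

// Hypothetical helper; not part of PDFBox.
final class FontChecksumSketch {

    /** 32-bit CRC as 8 lowercase hex digits. */
    static String crc32Hex(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return String.format("%08x", crc.getValue());
    }

    /** 512-bit SHA-2 digest as 128 lowercase hex digits. */
    static String sha512Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-512");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Sample bytes stand in for the contents of a font file.
        byte[] sample = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.println(crc32Hex(sample));           // prints cbf43926
        System.out.println(sha512Hex(sample).length()); // prints 128
    }
}
```

CRC32 is much cheaper to compute but is not collision resistant, which is acceptable only if the value is used to notice accidental changes rather than to defend against tampering.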
> Font operation takes a long time with 3.0.1
> -------------------------------------------
>
> Key: PDFBOX-5727
> URL: https://issues.apache.org/jira/browse/PDFBOX-5727
> Project: PDFBox
> Issue Type: Bug
> Affects Versions: 2.0.30, 3.0.1 PDFBox
> Reporter: Tilman Hausherr
> Assignee: Tilman Hausherr
> Priority: Major
> Fix For: 2.0.31, 3.0.2 PDFBox, 4.0.0