As of v3.0.0-alpha3, the brilliant FontBox library is capable of parsing many data structures inside OTF/TTF fonts. However, not everything works smoothly:
1. Unlike other table readers, the 'org.apache.fontbox.ttf.GlyphSubstitutionTable#read' method does not end with an 'initialized = true' statement. As a result, every subsequent invocation of the 'org.apache.fontbox.ttf.TrueTypeFont#getGsubData' method re-parses the GSUB table: parsing happens at least once when the font is parsed and then again every time the method is called. Is this a bug or a non-obvious design decision?
2. There are two closely related issues around the following behavior:
   1. In order to load GSUB data from a font, the library requires a language for which glyph substitutions must be built. The language is chosen based on the script tags provided by the font. The mapping between script tags and languages is provided by the 'org.apache.fontbox.ttf.model.Language' enumeration, which currently supports only one language, and it is... Bengali. However, the library doesn't communicate the problem to the user directly, not even through the logs. Instead, it silently falls back to the 'org.apache.fontbox.ttf.model.GsubData#NO_DATA_FOUND' constant in the 'org.apache.fontbox.ttf.gsub.GlyphSubstitutionDataExtractor#getGsubData' method, which makes it quite hard to find the root cause of the problem. So the question is: would it be acceptable, from a backward compatibility point of view, to throw an exception in such cases?
   2. As stated in the JavaDoc of the 'org.apache.fontbox.ttf.model.Language' class, to support a new language one should add a new enumeration item and provide a corresponding 'org.apache.fontbox.ttf.gsub.GsubWorker' implementation. Indeed, without such a worker, the 'org.apache.fontbox.ttf.gsub.GsubWorkerFactory#getGsubWorker' method throws an UnsupportedOperationException. However, this makes it impossible to even load GsubData from a font, which is critical for (at least my) FontBox use case.
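To make item 1 concrete, here is a minimal, self-contained sketch of the effect of the missing flag. It is not FontBox code: 'LazyTableSketch', 'Table', 'getTable' and 'parsesAfter' are hypothetical names, and the sketch assumes that the accessor re-reads a table whenever its flag is unset, which is what the behavior described in item 1 suggests.

```java
public class LazyTableSketch {

    /** Hypothetical stand-in for a font table; not the FontBox class. */
    static class Table {
        private final boolean marksInitialized;
        private boolean initialized;
        int parseCount;

        Table(boolean marksInitialized) {
            this.marksInitialized = marksInitialized;
        }

        void read() {
            parseCount++; // stands in for the expensive parsing work
            if (marksInitialized) {
                initialized = true; // the statement that appears to be missing for GSUB
            }
        }

        boolean isInitialized() {
            return initialized;
        }
    }

    /** Hypothetical accessor (an assumption): parses on demand while the flag is unset. */
    static Table getTable(Table table) {
        if (!table.isInitialized()) {
            table.read();
        }
        return table;
    }

    /** Returns how many times the table was parsed after the given number of accessor calls. */
    static int parsesAfter(int calls, boolean marksInitialized) {
        Table table = new Table(marksInitialized);
        for (int i = 0; i < calls; i++) {
            getTable(table);
        }
        return table.parseCount;
    }

    public static void main(String[] args) {
        System.out.println("with flag:    " + parsesAfter(3, true));
        System.out.println("without flag: " + parsesAfter(3, false));
    }
}
```

In this model, the reader that sets the flag is parsed once regardless of how many times the accessor is called, while the reader that never sets it is parsed on every call.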
As a possible solution to the second issue, I'd suggest introducing a default GsubWorker implementation that would perform a no-op substitution (emitting a WARN message into the log for clarity) for any language that is not explicitly supported by the library. Additionally, an 'isStrict' flag with a default value of 'true' may be introduced to throw an exception instead of falling back to this new default implementation (much like the flag of the same name in the 'org.apache.fontbox.ttf.TrueTypeFont#getUnicodeCmapLookup(boolean)' method). Does this sound reasonable?

P.S. Is there a way to propose the described fixes in the form of a Pull Request, as is usually done in many open source projects on GitHub? This would bring the discussion much closer to the code and thus make it significantly more productive.

Cheers,
Vladimir