FontBox: GSUB handling issues

Vladimir Plizga Mon, 16 Jan 2023 04:37:14 -0800

As of v3.0.0-alpha3, the brilliant FontBox library is capable of parsing
many data structures inside OTF/TTF fonts. However, not everything works
smoothly:


   1. Unlike other table readers, the
   'org.apache.fontbox.ttf.GlyphSubstitutionTable#read' method does not
   end with an 'initialized = true' statement. As a result, every subsequent
   invocation of the 'org.apache.fontbox.ttf.TrueTypeFont#getGsubData' method
   re-parses the GSUB table: it is parsed at least once during font parsing
   and then again each time the method is called. Is this a bug or a
   deliberate, non-obvious design decision?

   2. There are 2 closely related issues around the following behavior:
      1. In order to load GSUB data from a font, the library requires a
      language for which glyph substitutions must be built. The language is
      chosen based on the script tags provided by the font. The mapping between
      script tags and languages is provided by the
      'org.apache.fontbox.ttf.model.Language' enumeration, which currently
      supports only one language, and it is... Bengali. However, when a font's
      script tags match no supported language, the library doesn't report the
      problem to the user in any way (not even in the logs). Instead, it
      silently falls back to the
      'org.apache.fontbox.ttf.model.GsubData#NO_DATA_FOUND' constant in the
      'org.apache.fontbox.ttf.gsub.GlyphSubstitutionDataExtractor#getGsubData'
      method, making it quite hard to find the root cause. So the question
      is: would throwing an exception in such cases be acceptable from the
      backward compatibility point of view?

      2. As stated in the JavaDoc of the 'org.apache.fontbox.ttf.model.Language'
      class, to support a new language one should add a new enumeration item
      and provide a corresponding 'org.apache.fontbox.ttf.gsub.GsubWorker'
      implementation. Indeed, the absence of the worker leads to an
      UnsupportedOperationException being thrown from the
      'org.apache.fontbox.ttf.gsub.GsubWorkerFactory#getGsubWorker' method.
      However, this makes it impossible to even load GsubData from a font, which
      is critical for (at least my) FontBox use case. As a possible solution,
      I'd suggest introducing a default GsubWorker implementation that would
      perform a no-op substitution (emitting a WARN message into the log for
      clarity) for any language that is not explicitly supported by the
      library. Additionally, an 'isStrict' flag with the default value 'true'
      may be introduced to throw an exception instead of falling back to this
      new default implementation (much like the same-named flag in the
      'org.apache.fontbox.ttf.TrueTypeFont#getUnicodeCmapLookup(boolean)'
      method). Does it sound reasonable?
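
To make the proposal in 2.2 more concrete, here is a minimal, self-contained
sketch of the fallback logic. It is not FontBox code: the 'Worker' interface,
'NoOpWorker', 'workerFor' and the script tag list are all hypothetical
stand-ins chosen for illustration, and the real GsubWorker contract may well
look different.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.logging.Logger;

// Illustrative sketch only; none of these types exist in FontBox.
public class GsubFallbackSketch {
    private static final Logger LOG =
            Logger.getLogger(GsubFallbackSketch.class.getName());

    // Assumption: script tags that have a dedicated worker (Bengali only).
    private static final List<String> SUPPORTED = Arrays.asList("beng", "bng2");

    /** Hypothetical, simplified worker contract: glyph IDs in, glyph IDs out. */
    interface Worker {
        List<Integer> apply(List<Integer> glyphIds);
    }

    /** Proposed default: returns the glyph IDs unchanged. */
    static final class NoOpWorker implements Worker {
        @Override
        public List<Integer> apply(List<Integer> glyphIds) {
            return new ArrayList<>(glyphIds);
        }
    }

    /** Placeholder for a real language-specific worker (logic omitted). */
    static final class SupportedWorker implements Worker {
        @Override
        public List<Integer> apply(List<Integer> glyphIds) {
            return new ArrayList<>(glyphIds); // real substitutions would go here
        }
    }

    /** Strict by default, mirroring the proposed 'isStrict' default of true. */
    static Worker workerFor(String scriptTag) {
        return workerFor(scriptTag, true);
    }

    static Worker workerFor(String scriptTag, boolean isStrict) {
        if (SUPPORTED.contains(scriptTag)) {
            return new SupportedWorker();
        }
        if (isStrict) {
            // Strict mode keeps the current behavior: fail loudly.
            throw new UnsupportedOperationException(
                    "No GSUB worker for script tag '" + scriptTag + "'");
        }
        // Lenient mode: warn once per call and fall back to a no-op.
        LOG.warning("No GSUB worker for script tag '" + scriptTag
                + "', falling back to no-op substitution");
        return new NoOpWorker();
    }

    public static void main(String[] args) {
        List<Integer> glyphs = Arrays.asList(10, 20, 30);
        // Lenient lookup for an unsupported script leaves the glyphs untouched.
        System.out.println(workerFor("latn", false).apply(glyphs));
    }
}
```

With such a fallback, callers that only need the parsed GSUB data could opt
out of strict mode, while everyone else would keep getting the exception.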

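Regarding item 1, the effect of the missing flag can be shown with a small
stand-alone model. Again, this is only a sketch with hypothetical names
('Table', 'access', 'parsesAfter'); it assumes the accessor re-reads the
table whenever it is not marked as initialized, which is how I understand
the behavior described above.

```java
// Illustrative model of a lazily parsed table; not FontBox code.
public class LazyTableSketch {

    static class Table {
        private final boolean setsFlag; // does read() end with initialized = true?
        private boolean initialized;
        int parseCount;

        Table(boolean setsFlag) {
            this.setsFlag = setsFlag;
        }

        void read() {
            parseCount++; // stands in for the actual (expensive) parsing
            if (setsFlag) {
                initialized = true;
            }
        }

        boolean isInitialized() {
            return initialized;
        }
    }

    // Mimics an accessor that parses the table on demand.
    static void access(Table table) {
        if (!table.isInitialized()) {
            table.read();
        }
    }

    // Returns how many times the table was parsed after 'calls' accesses.
    static int parsesAfter(boolean setsFlag, int calls) {
        Table table = new Table(setsFlag);
        for (int i = 0; i < calls; i++) {
            access(table);
        }
        return table.parseCount;
    }

    public static void main(String[] args) {
        System.out.println(parsesAfter(true, 3));  // prints 1
        System.out.println(parsesAfter(false, 3)); // prints 3
    }
}
```
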
P.S. Is there a way to propose the described fixes in the form of a Pull
Request like it is usually done in many open source projects on GitHub?
This would make the discussion much closer to the code and thus
significantly more productive.

Cheers,
Vladimir
