Hi,
Thanks for your comments. I have fixed (1), which is an obvious bug.
Yes, you can make a PR, although this is only used indirectly. You can
also comment in the JIRA issue
https://issues.apache.org/jira/browse/PDFBOX-4189
If you don't have an account, see
https://infra.apache.org/jira-guidelines.html#who (sorry for the
inconvenience).
Tilman
On 16.01.2023 13:36, Vladimir Plizga wrote:
As of v3.0.0-alpha3, the brilliant FontBox library is capable of parsing
many data structures inside OTF/TTF fonts. However, not everything works
smoothly:
1. Unlike other table readers, the
'org.apache.fontbox.ttf.GlyphSubstitutionTable#read' method does not
end with an 'initialized = true' statement. As a result, every subsequent
invocation of the 'org.apache.fontbox.ttf.TrueTypeFont#getGsubData'
method re-parses the GSUB table, so the table is parsed at least once
when the font is parsed and then again every time the method is called.
Is it a bug or an intentional, non-obvious design decision?
2. There are 2 closely related issues around the following behavior:
   1. In order to load GSUB data from a font, the library requires a
   language for which glyph substitutions must be built. The language is
   chosen based on the script tags provided by the font. The mapping
   between script tags and languages is provided by the
   'org.apache.fontbox.ttf.model.Language' enumeration, which currently
   supports only one language, and it is... Bengali. However, the library
   doesn't communicate the problem to the user directly (not even in the
   logs). Instead, it silently falls back to the
   'org.apache.fontbox.ttf.model.GsubData#NO_DATA_FOUND' constant in the
   'org.apache.fontbox.ttf.gsub.GlyphSubstitutionDataExtractor#getGsubData'
   method, making it quite hard to find the root cause of the problem.
   So the question is: would it be acceptable, from the backward
   compatibility point of view, to throw an exception in such cases?
   2. As stated in the JavaDoc of the
   'org.apache.fontbox.ttf.model.Language' class, to support a new
   language one should add a new enumeration item and provide a
   corresponding 'org.apache.fontbox.ttf.gsub.GsubWorker' implementation.
   Indeed, the absence of the worker leads to an
   UnsupportedOperationException being thrown from the
   'org.apache.fontbox.ttf.gsub.GsubWorkerFactory#getGsubWorker' method.
   However, this makes it impossible to even load GsubData from a font,
   which is critical for (at least my) FontBox use case. As a possible
   solution, I'd suggest introducing a default GsubWorker implementation
   that would perform a no-op substitution (emitting a WARN message into
   the log for clarity) for any language that is not explicitly supported
   by the library. Additionally, an 'isStrict' flag with default value
   'true' may be introduced to throw an exception instead of falling back
   to this new default implementation (much like the flag of the same
   name in the
   'org.apache.fontbox.ttf.TrueTypeFont#getUnicodeCmapLookup(boolean)'
   method). Does it sound reasonable?
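To make the fallback idea from the second sub-item more concrete, here is
a minimal, self-contained Java sketch. It does not use the FontBox
classes: 'SketchWorker', 'NoOpSketchWorker' and 'SketchWorkerFactory' are
hypothetical stand-ins for GsubWorker, the proposed default worker and
GsubWorkerFactory, and the script tags are passed as plain strings only
for illustration. It assumes that a worker turns a list of glyph IDs into
another list of glyph IDs.

```java
import java.util.List;
import java.util.Map;
import java.util.logging.Logger;

// Hypothetical stand-in for a GSUB worker: maps glyph IDs to glyph IDs.
interface SketchWorker {
    List<Integer> applyTransforms(List<Integer> glyphIds);
}

// Proposed default: performs no substitution and returns the input as is.
final class NoOpSketchWorker implements SketchWorker {
    @Override
    public List<Integer> applyTransforms(List<Integer> glyphIds) {
        return glyphIds;
    }
}

// Hypothetical factory showing how an 'isStrict' flag could be honoured.
final class SketchWorkerFactory {
    private static final Logger LOG =
            Logger.getLogger(SketchWorkerFactory.class.getName());

    // Workers for the explicitly supported scripts, keyed by script tag.
    private final Map<String, SketchWorker> supported;

    SketchWorkerFactory(Map<String, SketchWorker> supported) {
        this.supported = supported;
    }

    // Strict by default, so the existing behavior would be preserved.
    SketchWorker getWorker(String scriptTag) {
        return getWorker(scriptTag, true);
    }

    SketchWorker getWorker(String scriptTag, boolean isStrict) {
        SketchWorker worker = supported.get(scriptTag);
        if (worker != null) {
            return worker;
        }
        if (isStrict) {
            throw new UnsupportedOperationException(
                    "No worker for script tag: " + scriptTag);
        }
        // Lenient mode: warn once here and fall back to the no-op worker.
        LOG.warning("No worker for script tag '" + scriptTag
                + "', glyph substitution will be skipped");
        return new NoOpSketchWorker();
    }
}
```

With 'isStrict' set to 'false', a request for an unsupported script would
return the no-op worker and the glyph IDs would pass through unchanged;
with the default, the caller would still get the exception.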
P.S. Is there a way to propose the described fixes in the form of a Pull
Request, as is usually done in many open source projects on GitHub?
This would bring the discussion much closer to the code and thus make it
significantly more productive.
Cheers,
Vladimir