It seems that with a font that has only a 3,0 cmap subtable (and maybe
some Macintosh subtables), HB will automatically do the shift by
F000 (in the function get_glyph_from_symbol) for code points below
U+00FF that are not mapped by the subtable.
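
To make the behavior described above concrete, here is a rough Python
sketch of the fallback. It is only a model based on the description in
this message, not the actual HarfBuzz code: the dictionary standing in
for the 3,0 subtable, the glyph ids, and the function name
lookup_with_symbol_shift are all made up for illustration, and treating
U+00FF itself as inside the shifted range is an assumption.

```python
# Hypothetical stand-in for a 3,0 (symbol) cmap subtable:
# maps code point -> glyph id. The glyph ids are invented.
SYMBOL_CMAP = {0xF041: 12, 0xF042: 13}


def lookup_with_symbol_shift(cmap, codepoint):
    """Return a glyph id, or None if the code point is not mapped.

    Models the fallback described above: try the code point as given,
    and if that fails for a low code point, retry at 0xF000 + codepoint.
    """
    glyph = cmap.get(codepoint)
    if glyph is not None:
        return glyph
    # Assumption: the shifted range includes U+00FF itself.
    if codepoint <= 0x00FF:
        return cmap.get(0xF000 + codepoint)
    return None
```

With this model, both U+0041 and U+F041 resolve to the same glyph, while
a code point above the shifted range such as U+0141 stays unmapped.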
It is clear that when U+0041 A is set with a symbol font, that
U+0041 actually has the semantics of a PUA code point, and certainly
should not be treated as an "A". That's the whole point of a 3,0 cmap
subtable.
Consider an HTML page. The font-family is only a request, and there is
no guarantee that the actual font will or will not be a symbol font.
Thus the semantics of the HTML page can change depending on the browser
environment. Outside a browser, it seems that the safe treatment would
therefore be to consider all code points below U+00FF as PUA, which is
clearly not tenable. So in that environment, I think that the shift
should not be done. Of course, U+F041 should work.
Note that the behavior of Word 2016 on Windows is actually more
elaborate: enter U+0041 and set it with a non-symbol font; copy/paste or
save to a text file, and the result is U+0041. But set this A in a
symbol font, copy/paste or save to a text file, and the result is
U+F041.
I think that the shift should be controllable by the client, rather than
systematically applied. I don't have a strong opinion about the default
behavior (i.e. when HB's client does not specify whether the shift
should be done or not).
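
As a rough illustration of what "controllable by the client" could look
like, here is a small Python sketch. Everything in it is hypothetical:
the function name lookup_glyph, the shift_symbol parameter, and the toy
cmap dictionary do not correspond to any existing HarfBuzz API. The
parameter is deliberately keyword-only with no default, so the sketch
takes no position on what the default behavior should be.

```python
# Hypothetical stand-in for a 3,0 (symbol) cmap subtable:
# maps code point -> glyph id. The glyph ids are invented.
TOY_SYMBOL_CMAP = {0xF041: 12}


def lookup_glyph(cmap, codepoint, *, shift_symbol):
    """Return a glyph id, or None if the code point is not mapped.

    shift_symbol is a hypothetical client-supplied switch: when True,
    unmapped code points up to U+00FF are retried at 0xF000 + codepoint;
    when False, no shift is attempted.
    """
    glyph = cmap.get(codepoint)
    if glyph is None and shift_symbol and codepoint <= 0x00FF:
        glyph = cmap.get(0xF000 + codepoint)
    return glyph
```

A client that wants U+0041 to keep meaning "A" would pass
shift_symbol=False and get no glyph from the symbol font, while U+F041
works either way.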
Thoughts?
Thanks,
Eric.