I'm in the process of rewriting our encoding converter infrastructure
in Rust. For a new implementation, it makes sense to support only the
Web-exposed encodings, that is, the encodings specified in the
Encoding Standard.

Currently, Firefox supports decoding non-Encoding Standard encodings
in one place: in TrueType system font names. TrueType fonts can
declare multiple names and the encoding of the names is specified on a
per-name basis.
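
To illustrate the per-name encoding, here is a minimal sketch of how a
name record's platform and encoding identifiers could select a decoder
under this proposal. The type and function names are hypothetical and
are not Gecko's actual code. The numeric IDs are the ones I recall
from the TrueType/OpenType "name" table documentation. The mapping is
deliberately simplified: Windows legacy encodings are omitted.

```rust
/// Identifiers of one record in a TrueType `name` table.
/// (Hypothetical type for illustration; not Gecko's actual structure.)
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct NameRecordIds {
    pub platform_id: u16,
    pub encoding_id: u16,
}

/// Returns the Encoding Standard name of the decoder to use for a name
/// record, or `None` if the record would be skipped.
/// Assumption: simplified mapping that ignores Windows legacy encodings.
pub fn decoder_for(ids: NameRecordIds) -> Option<&'static str> {
    match (ids.platform_id, ids.encoding_id) {
        // Unicode platform: names are stored as UTF-16BE.
        (0, _) => Some("UTF-16BE"),
        // Windows platform, Unicode BMP (1) or full repertoire (10).
        (3, 1) | (3, 10) => Some("UTF-16BE"),
        // Macintosh platform, Roman script.
        (1, 0) => Some("macintosh"),
        // Macintosh platform, Cyrillic script.
        (1, 7) => Some("x-mac-cyrillic"),
        // Everything else, including MacHebrew (1, 5), has no
        // Encoding Standard decoder in this sketch.
        _ => None,
    }
}
```

With this mapping, a Macintosh-platform record with encoding ID 5
(Hebrew) yields None, so that name would simply not be registered.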

Based on telemetry, Gecko sees these very rarely. The exception is
MacHebrew on Mac: the set of system-bundled fonts contains a font,
Raanana, that declares one of its names in MacHebrew, and Firefox
enumerates all system fonts. Note that this telemetry measures the
decoder getting instantiated, not the font getting used.

It also happens that Raanana doesn't declare its Hebrew name in
another way (Unicode or windows-1255), so removing support for
non-Encoding Standard font names has the effect of removing the
ability to specify Raanana by its Hebrew name in CSS. (I'm not
actually sure if Raanana is the only macOS-bundled font like this.)

https://hsivonen.com/test/moz/raanana.html indicates that Safari
supports specifying Raanana by its Hebrew name but Chrome doesn't.
That is, by making this change we'd end up with Chrome parity but
would break Safari parity. Sites that have been tested in Chrome can
already be expected to refer to Raanana at least by its Romanized
name.


Answers to anticipated questions:


Won’t this make our TrueType support incomplete?

We already don’t support all possible legacy TrueType font name
encodings, and the names only matter for fonts installed on the
system, so as long as present-day OS-bundled fonts work (well enough)
we don't need to support everything that ever existed.


How rare are the cases other than MacHebrew-on-Mac?

Except for MacHebrew-on-Mac, each of the non-Roman, non-Cyrillic
single-byte Mac encodings is seen in 0.00% of Firefox sessions. Even in
the cases where this isn't absolutely zero (i.e. it's only zero to two
decimal places), there are further mitigating factors: 1) The fonts
may be declaring their name(s) also in Unicode or in a supported
Windows legacy encoding. 2) Fonts that are this rare don't make sense
to specify on Web sites except for fingerprinting purposes. (That is,
when Web authors specify fonts that reside on the user's system as
opposed to being downloaded from the Web, they tend to want to specify
widely-available system-bundled fonts.)


Are Web Fonts affected?

No. The @font-face mechanism points to fonts by URL and ignores the
names declared within the font files.


Can Raanana still be used?

Yes, by using the Romanized name "Raanana".


Are there alternatives to removing support for the Hebrew name of Raanana?

Yes, but they are more complex than just removing support. This is the
opportunity for interested parties to advocate for the alternatives,
though. The alternatives are:

* Including a MacHebrew decoder in Gecko outside the normal converter
infrastructure only for the purpose of decoding system font names.

* On macOS only, decoding MacHebrew font names by calling the system
converter API (and using that API for MacHebrew only).

* Including a special-case check for the name Raanana when enumerating
the system fonts and adding a synthetic entry for the Hebrew name in
Gecko's data structures when the Romanized name is seen (and doing the
same for other macOS-bundled Hebrew fonts that have a Hebrew name in
MacHebrew only if other such fonts exist). (This is my preference for
an alternative to outright removal of the ability to specify Raanana
by its Hebrew name.)
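
The third alternative could look roughly like the following sketch.
Everything here is hypothetical: FontList, SYNTHETIC_ALIASES and the
method names are invented for illustration and do not correspond to
Gecko's real font list code. The Hebrew string is assumed to be the
Hebrew spelling of the city name Ra'anana; I have not checked it
against the bytes in the font.

```rust
use std::collections::HashMap;

/// Hypothetical table: Romanized family name -> native-script name that
/// the font declares only in a legacy encoding.
/// Assumption: the Hebrew string is the spelling of the city name.
const SYNTHETIC_ALIASES: &[(&str, &str)] = &[("Raanana", "רעננה")];

/// Hypothetical font list mapping lowercased names to the canonical
/// family name.
pub struct FontList {
    families: HashMap<String, String>,
}

impl FontList {
    pub fn new() -> Self {
        FontList { families: HashMap::new() }
    }

    /// Registers a family seen during system font enumeration and, if
    /// its Romanized name is in the alias table, also registers the
    /// synthetic native-script name pointing at the same family.
    pub fn add_system_family(&mut self, name: &str) {
        self.families.insert(name.to_lowercase(), name.to_string());
        for (romanized, native) in SYNTHETIC_ALIASES {
            if name == *romanized {
                self.families.insert(native.to_lowercase(), name.to_string());
            }
        }
    }

    /// Resolves a family name from CSS to the canonical family name,
    /// ignoring case differences.
    pub fn resolve(&self, css_name: &str) -> Option<&str> {
        self.families.get(&css_name.to_lowercase()).map(String::as_str)
    }
}
```

The point of this shape is that no MacHebrew decoder is needed at all:
the alias is keyed on the Romanized name, which is already decodable.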

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform
