I'm in the process of rewriting our encoding converter infrastructure in Rust. For a new implementation, it makes sense to support only the Web-exposed encodings, that is, the encodings specified in the Encoding Standard.

Currently, Firefox supports decoding non-Encoding Standard encodings in one place: in TrueType system font names. TrueType fonts can declare multiple names, and the encoding of each name is specified on a per-name basis.

Based on telemetry, Gecko sees these very rarely except for MacHebrew on Mac. That case arises because the set of system-bundled fonts contains a font, Raanana, that declares one of its names in MacHebrew, and Firefox enumerates all system fonts. The telemetry does not measure the font getting used; it measures the decoder getting instantiated.

It also happens that Raanana doesn't declare its Hebrew name in any other way (Unicode or windows-1255), so removing support for non-Encoding Standard font names has the effect of removing the ability to specify Raanana by its Hebrew name in CSS. (I'm not actually sure if Raanana is the only macOS-bundled font like this.)

https://hsivonen.com/test/moz/raanana.html indicates that Safari supports specifying Raanana by its Hebrew name but Chrome doesn't. That is, by making this change we'd end up with Chrome parity but would break Safari parity. Sites that have been tested in Chrome can already be expected to refer to Raanana at least by its Romanized name.

Answers to anticipated questions:

Won't this make our TrueType support incomplete?

We already don't support all possible legacy TrueType font name encodings, and the names only matter for fonts installed on the system, so as long as present-day OS-bundled fonts work (well enough) we don't need to support everything that ever existed.

How rare are the cases other than MacHebrew-on-Mac?

Except for MacHebrew-on-Mac, each of the non-Roman, non-Cyrillic single-byte Mac encodings is seen in 0.00% of Firefox sessions. Even in the cases where this isn't absolutely zero (i.e. it's only zero to two decimal places), there are further mitigating factors:

 1) The fonts may be declaring their name(s) also in Unicode or in a supported Windows legacy encoding.

 2) Fonts that are this rare don't make sense to specify on Web sites except for fingerprinting purposes. (That is, when Web authors specify fonts that reside on the user's system as opposed to being downloaded from the Web, they tend to want to specify widely-available system-bundled fonts.)

Are Web Fonts affected?

No. The @font-face mechanism points to fonts by URL and ignores the names declared within the font files.

Can Raanana still be used?

Yes, by using the Romanized name "Raanana".

Are there alternatives to removing support for the Hebrew name of Raanana?

Yes, but they are more complex than just removing support. This is the opportunity for interested parties to advocate for the alternatives, though. The alternatives are:

 * Including a MacHebrew decoder in Gecko outside the normal converter infrastructure only for the purpose of decoding system font names.

 * Decoding MacHebrew font names on macOS only, by calling the system converter API for MacHebrew only.

 * Including a special-case check for the name Raanana when enumerating the system fonts and adding a synthetic entry for the Hebrew name in Gecko's data structures when the Romanized name is seen (and doing the same for other macOS-bundled Hebrew fonts that have a Hebrew name in MacHebrew only, if other such fonts exist). (This is my preference for an alternative to outright removal of the ability to specify Raanana by its Hebrew name.)

-- 
Henri Sivonen
hsivo...@hsivonen.fi
https://hsivonen.fi/
_______________________________________________
dev-platform mailing list
dev-platform@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-platform