Paul Eggert asked:
> > By the way, how important is it to support awful encodings like
> > shift-JIS that contain bytes that look like '\'? If we don't have to
> > support these encodings any more, things get a bit easier.
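
To make the concern concrete: in Shift_JIS the byte 0x5C (ASCII '\') can
appear as the second byte of a two-byte character, so a byte-wise search
for '\' reports matches that are not backslashes. A minimal sketch in
Python, assuming its built-in "shift_jis" codec and using U+8868 as a
sample character; the helper name naive_has_backslash is made up for
illustration:

```python
# Sample character (U+8868) whose Shift_JIS encoding ends in byte 0x5C.
sample = "\u8868"
encoded = sample.encode("shift_jis")


def naive_has_backslash(data: bytes) -> bool:
    """Byte-wise scan, as code assuming an ASCII-safe encoding might do."""
    return b"\\" in data


# The text contains no backslash character at all...
print("\\" in sample)
# ...but its Shift_JIS bytes contain 0x5C as a trail byte.
print(encoded.hex())
print(naive_has_backslash(encoded))
# In UTF-8 every byte of a non-ASCII character is >= 0x80, so no false match.
print(naive_has_backslash(sample.encode("utf-8")))
```

Code that scans bytes for '\' or other ASCII delimiters therefore needs
multibyte-aware handling for such an encoding.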

Here we are talking about locale encodings, and Shift_JIS (as well as
SHIFT_JISX0213) is not usable as a locale encoding in glibc. See e.g.
[1], [2]. That's the reason why no Shift_JIS locale is listed in
glibc/localedata/SUPPORTED. [3]

Florian Weimer wrote:
> There is a Shift-JIS variant which is ASCII-transparent (Windows-31J,
> it's also specified by WhatWG/HTML5), so from a glibc point of view, it
> would be just an ordinary charset like any other.
>
> But feedback we have received is that the users who want Shift-JIS
> really want the original thing.
>
> We do not presently support either variant downstream, but one potential
> way forward would be to turn Windows-31J into a fully supported glibc
> charset with a corresponding ja_JP locale (which would imply downstream
> support as well), and just hope that it displaces the original Shift-JIS
> in the future.

I don't think there's a real need for that. In the years 1995 ... 2005
there was a lot of resistance against Unicode in Japan, because Unicode
maps several slightly different-looking glyph images to the same
glyph/character (even for Western encodings, for example the Polish
accents look a bit different than the French ones), and - at the time -
Unicode did not have means to disambiguate these, thus people complained
that "characters are rendered incorrectly if you use Unicode". This has
been resolved for more than 10 years already.

Bruno

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=3140
[2] https://sourceware.org/legacy-ml/libc-alpha/2000-10/msg00311.html
[3] https://sourceware.org/git/?p=glibc.git;a=blob;f=localedata/SUPPORTED