On Aug  7 11:28, Corinna Vinschen wrote:
> On Aug  5 21:06, Thomas Wolff wrote:
> > On 04.08.2017 at 19:01, Corinna Vinschen wrote:
> > > This shouldn't matter to you, just keep it in place.  It's a historical,
> > > low footprint conversion for Japanese characters without pulling in the
> > > unicode stuff.  Not used on Cygwin so just ignore.
> > I had noticed meanwhile that this is not active in Cygwin, but it's broken
> > anyway for multiple reasons:
> > * platforms for which wchar_t is not Unicode should be explicitly listed
> > * if used, the transformation needs to be applied to all non-Unicode
> >   locales (also Chinese, Korean, and even 8-bit locales such as *.CP1252)
> > * for towupper and towlower, the result must be back-transformed into the
> >   respective locale encoding
> > * particularly the locale-specific _l functions inconsistently do not use
> >   the transformation but have this note:
>
> No, no, no.  The functionality is restricted to certain use-cases and
> always was.  It was a paid-for customer extension back in the day and it
> was *sufficient* for the use-cases.  It's not clear how many newlib
> users are still using it, but it's not a good idea to remove it without
> checking first.  That means, ask on the newlib mailing list how many are
> using the historical jp2uc code, and if we don't get a reply within,
> say, a month, we can probably nuke it.
To clarify where we're coming from:

If you look into newlib/libc/locale/locale.c, function __loadlocale,
you'll notice that outside of Cygwin, only six single/double/multi-byte
codesets are supported at all:

  ASCII
  ISO-8859-1
  EUCJP
  JIS
  SJIS
  UTF-8

The multichar/widechar conversion functions for EUCJP, JIS and SJIS were
implemented to have a low footprint in the first place, see, for
instance, __sjis_wctomb in newlib/libc/stdlib/wctomb_r.c.  This is all
about simplification for small targets.  There was never a requirement
that converting a UTF-8 char to wchar_t and converting the equivalent
SJIS char to wchar_t would result in the same wide char.

Consequently, Cygwin does not use these conversion functions.  Rather, it
uses Windows conversion functions, see the conversion functions in
winsup/cygwin/strfuncs.cc, to get a consistent wide char representation
(UTF-16).  Another side-effect is that Cygwin does not support JIS at
all, only SJIS, see the comment in strfuncs.cc.

Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
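
To illustrate the point about differing wide char values, here is a
minimal sketch in C.  It assumes the low-footprint approach amounts to
packing the bytes of the multibyte sequence into the wide value without
any Unicode table lookup; that is an assumption for illustration, not a
copy of the newlib code.  The helpers sjis_pack and utf8_decode3 are
hypothetical names and do not exist in newlib or Cygwin.

```c
#include <stdint.h>

/* Hypothetical helper (not a newlib function): pack a two-byte SJIS
   sequence into one wide value, lead byte in the high 8 bits.
   No Unicode mapping table is consulted. */
static uint32_t sjis_pack(const unsigned char *s)
{
    return ((uint32_t)s[0] << 8) | (uint32_t)s[1];
}

/* Hypothetical helper (not a newlib function): decode a three-byte
   UTF-8 sequence to its Unicode code point.  Assumes well-formed
   input; no validation is done. */
static uint32_t utf8_decode3(const unsigned char *s)
{
    return ((uint32_t)(s[0] & 0x0F) << 12)
         | ((uint32_t)(s[1] & 0x3F) << 6)
         |  (uint32_t)(s[2] & 0x3F);
}
```

For HIRAGANA LETTER A, the SJIS bytes 0x82 0xA0 pack to 0x82A0, while
the UTF-8 bytes 0xE3 0x81 0x82 decode to 0x3042 (U+3042).  Same
character, two different wide values, which is fine as long as nothing
expects wchar_t to hold Unicode.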