So glibc is broken. This doesn't make it a Unicode problem.

On Sat, Nov 8, 2014 at 8:22 PM, Mike FABIAN <[email protected]> wrote:
> Philippe Verdy <[email protected]> wrote:
>
> > note that tolower() and toupper() can only work on a 1-character level; it
> > is not recommended for use for changing case of plain text.
> >
> > For correct handling of locales, tolower and toupper should be replaced by
> > strtolower and strtoupper (or their aliases) which will be able to process
> > character clusters and contextual casing rules needed for a language or
> > orthographic style
>
> Yes, thank you for explaining this.
>
> But these details of upper and lower casing cannot be expressed in the
> “i18n” file of glibc:
>
> https://sourceware.org/git/?p=glibc.git;a=blob;f=localedata/locales/i18n
>
> For toupper and tolower, this file just has character -> character
> mapping tables, for example the “tolower” table contains only
>
> (<U03A3>,<U03C3>)
>
> (i.e. mapping Σ U+03A3 -> σ U+03C3, never to the final sigma ς
> U+03C2).
>
> More correct, detailed information about upper and lower case must come
> from elsewhere, not from this “i18n” file in glibc. Using only the
> information from this “i18n” file, not even the Greek sigma can be
> handled correctly.
>
> Pravin and I want to update this “i18n” file to the latest
> data from Unicode 7.0.0, doing it as correctly as possible within
> the limitations caused by this file and the ISO C standard.
>
> --
> Mike FABIAN <[email protected]>
> ☏ Office: +49-69-365051027, internal 8875027
> Lack of sleep is the enemy of good work.
> _______________________________________________
> Unicode mailing list
> [email protected]
> http://unicode.org/mailman/listinfo/unicode
>

--
Christopher Vance
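
The sigma case discussed above can be illustrated with a small sketch. It uses Python rather than glibc, purely as an assumption-laden illustration: Python's str.lower() applies the Unicode final-sigma rule when given a whole string, while lowercasing one character at a time has no context to work with, much like a character -> character table. The helper names lower_per_char and lower_whole_string are hypothetical, and nothing here claims to reproduce glibc's actual behaviour.

```python
def lower_per_char(text: str) -> str:
    """Lowercase each character in isolation.

    This mimics a character -> character table: every call sees a single
    character and therefore cannot know where it sits inside a word.
    """
    return "".join(ch.lower() for ch in text)


def lower_whole_string(text: str) -> str:
    """Lowercase the whole string at once.

    Python's str.lower() can look at neighbouring characters, so it is
    able to choose the final sigma at the end of a word.
    """
    return text.lower()


word = "ΟΔΟΣ"  # Greek for "street", in capitals

per_char = lower_per_char(word)    # ends in U+03C3 (σ)
whole = lower_whole_string(word)   # ends in U+03C2 (ς)

print(per_char, hex(ord(per_char[-1])))
print(whole, hex(ord(whole[-1])))
```

The two results differ only in the last character, which is the point of the example: the choice between σ and ς depends on position in the word, and a per-character mapping has no way to express that.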

