Philippe Verdy <[email protected]> wrote:

> note that tolower() and toupper() can only work one 1-character level, it
> is not recommended for use for changing case of plain text.
>
> For correct handling of locales, to upper and toupper should be replaced by
> strtolower and strtoupper (or their aliases) which will be able to process
> character clusters and contextual casing rules needed for a language or
> orthographic style

Yes, thank you for explaining this.

But these details of upper and lower casing cannot be expressed in the
“i18n” file of glibc:

https://sourceware.org/git/?p=glibc.git;a=blob;f=localedata/locales/i18n

For toupper and tolower, this file has only character -> character
mapping tables. For example, the only entry for Σ in the “tolower”
table is

    (<U03A3>,<U03C3>)

(i.e. it maps Σ U+03A3 -> σ U+03C3, never to the final sigma ς U+03C2).

More correct, detailed information about upper and lower case must
come from elsewhere, not from this “i18n” file in glibc. Using only
the information in this “i18n” file, not even the Greek sigma can be
handled correctly.

Pravin and I want to update this “i18n” file to the latest data from
Unicode 7.0.0, doing it as correctly as possible within the limitations
imposed by this file and the ISO C standard.

-- 
Mike FABIAN <[email protected]>
☏ Office: +49-69-365051027, internal 8875027
Lack of sleep is the enemy of good work.
_______________________________________________
Unicode mailing list
[email protected]
http://unicode.org/mailman/listinfo/unicode
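
To illustrate the Σ limitation described in this message, here is a minimal sketch in Python. It is not glibc code: TOLOWER_TABLE and table_tolower are hypothetical names, and the table holds only three entries chosen for the example. It assumes a CPython 3 interpreter, whose built-in str.lower() applies the Unicode final-sigma rule, and compares that with a pure character -> character lookup.

```python
# Hypothetical stand-in for a 1:1 "tolower" table of the kind described above.
# Only three entries are listed here; this is not the real glibc data.
TOLOWER_TABLE = {
    "\u039F": "\u03BF",  # Ο -> ο
    "\u0394": "\u03B4",  # Δ -> δ
    "\u03A3": "\u03C3",  # Σ -> σ (the table has no way to choose ς U+03C2)
}


def table_tolower(text):
    """Lowercase text one character at a time, ignoring neighbouring characters."""
    return "".join(TOLOWER_TABLE.get(ch, ch) for ch in text)


word = "\u039F\u0394\u039F\u03A3"  # ΟΔΟΣ, with Σ in word-final position

# Per-character mapping: the last letter becomes σ U+03C3.
per_char = table_tolower(word)

# Context-aware lowercasing: Python's str.lower() uses ς U+03C2 at the end of a word.
context_aware = word.lower()

print(per_char)
print(context_aware)
print(per_char == context_aware)
```

The two results differ only in the last character, which is the information a plain character -> character table cannot carry.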

