> > > In our new implementation of Data.Char.isUpper and friends, I made the
> > > simplifying assumption that Char==wchar_t==Unicode. With glibc, this
> > > appears to be valid as long as (a) you set LANG to something other than
> > > "C" or "POSIX", and (b) you call setlocale() first.
> >
> > The glibc Info file says:
> >
> >     The wide character character set always is UCS4, at least on
> >     GNU systems.
>
> yes. with glibc, wchar_t is always unicode no matter what the locale.
> better yet, all ISO C implementations define a handy C symbol to test
> for this. if __STDC_ISO_10646__ is defined then wchar_t is always
> unicode no matter what.

Sure, but as I've been saying, glibc's implementation doesn't behave that
way in practice. In the C or POSIX locale, the wide-character
classification functions only recognise ASCII. Try it:

    #include <wctype.h>
    #include <stdio.h>
    #include <locale.h>

    int main(void)
    {
        setlocale(LC_ALL, "");           // take the locale from the environment
        printf("%d\n", iswupper('A'));
        printf("%d\n", iswupper(0x391)); // Greek capital alpha
        printf("%d\n", iswupper(0x3B1)); // Greek small alpha
        printf("%d\n", iswlower(0x391)); // Greek capital alpha
        printf("%d\n", iswlower(0x3B1)); // Greek small alpha
        return 0;
    }

    $ LANG=en_GB ./a.out
    1
    1
    0
    0
    1
    $ LANG=C ./a.out
    1
    0
    0
    0
    0

Should this be considered a bug in glibc?

Cheers,
	Simon
_______________________________________________
FFI mailing list
[EMAIL PROTECTED]
http://www.haskell.org/mailman/listinfo/ffi