> -----Original Message-----
> From: Keld Jørn Simonsen [mailto:[EMAIL PROTECTED]]
...
> > > If wchar_t is Unicode, then the C compiler will have the macro
> > >
> > > __STDC_ISO_10646__ An integer constant of the form yyyymmL
> >
> > On glibc-2.1.3 wchar_t is Unicode but the macro is not defined.
> > Moreover, if it's not Unicode then I see no good way to sensibly
> > convert between its encoding and Unicode anyway.
>
> Well, my understanding is that the new glibc does not run Unicode
> but UCS-4. Unicode is inherently 16 bit - I hope they someday
> would step into the 32 bit world, but have seen no signs of it.

Maybe that's because you are not looking... ;-)

Unicode does have a preference for UTF-16. However, UTF-8 has long
been an encoding form for Unicode as well, and UTF-32 (UCS-4 bounded
to the first 17 planes) is very much in the making, though it is not
formally part of the Unicode standard yet.
See http://www.unicode.org/unicode/reports/tr19/.

UCS-2 used to be an encoding form for Unicode (and at that time the
only one). But it has long since been superseded by UTF-16 as the
preferred (but not only) encoding form. UCS-4 was never an encoding
form for Unicode, but UTF-32 is. UTF-8 limited to the first 17 planes
has been a Unicode encoding form ever since Unicode moved from UCS-2
to UTF-16, i.e. since Unicode 2.0.

                /kent k

-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/
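
As a rough illustration of the points discussed in the message above, here
is a minimal C sketch. It assumes a C99 compiler with <stdint.h>. The names
WCHAR_IS_ISO_10646, is_utf32_scalar and utf16_encode are hypothetical and
exist only for this example; only __STDC_ISO_10646__ comes from the thread.
The sketch records whether the compiler defines the macro, checks that a
value lies in the first 17 planes (the bound that separates UTF-32 from
unrestricted UCS-4), and shows how UTF-16 reaches beyond 16 bits by using
surrogate pairs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Records whether the implementation promises that wchar_t holds
   ISO 10646 code points. The macro can be absent even when wchar_t is
   in fact Unicode, as reported for glibc-2.1.3 in the thread. */
#ifdef __STDC_ISO_10646__
#define WCHAR_IS_ISO_10646 1
#else
#define WCHAR_IS_ISO_10646 0
#endif

/* Hypothetical helper: returns 1 if cp is a value UTF-32 may carry,
   i.e. it lies in planes 0..16 (at most 0x10FFFF) and is not a
   surrogate code point; returns 0 otherwise. */
static int is_utf32_scalar(uint32_t cp)
{
    if (cp > 0x10FFFFu)
        return 0;               /* beyond the first 17 planes */
    if (cp >= 0xD800u && cp <= 0xDFFFu)
        return 0;               /* reserved for UTF-16 surrogates */
    return 1;
}

/* Hypothetical helper: encodes cp as UTF-16 into out.
   Returns the number of 16-bit units written (1 or 2),
   or 0 if cp cannot be encoded. */
static size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    assert(out != NULL);
    if (!is_utf32_scalar(cp))
        return 0;
    if (cp < 0x10000u) {
        out[0] = (uint16_t)cp;  /* plane 0: one unit, same as UCS-2 */
        return 1;
    }
    cp -= 0x10000u;             /* 20 bits remain for planes 1..16 */
    out[0] = (uint16_t)(0xD800u + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00u + (cp & 0x3FFu)); /* low surrogate */
    return 2;
}
```

Under these assumptions, a value such as 0x110000 is representable in
31-bit UCS-4 but is rejected by both helpers, which is the practical
difference between UCS-4 and UTF-32 that the message describes.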
