[EMAIL PROTECTED] (Florian Weimer) wrote on 05.08.01 in <[EMAIL PROTECTED]>:
> [EMAIL PROTECTED] (Kai Henningsen) writes:
>
> > * Do we need a "native" wide char encoding, too (mostly for Win32 where
> > it's UTF-16, but possibly also some Asian thing)?
>
> A single 'char' encoded in UTF-16? This sounds horrible.
I can't quite parse that.
Win32 has two standard encodings. One is an 8-bit, localized encoding
("ANSI", though it isn't really; "A" function suffix), the best-known
example being code page 1252. The other is UTF-16, traditionally UCS-2
("Unicode"; "W" function suffix). You get "CreateFileA" [1] to open files
with char * filenames and "CreateFileW" to open files with (usually)
wchar_t * filenames, plus "CreateFile", which is a macro that expands to
one of those two depending on a global define (IIRC, UNICODE for the
Win32 headers, with _UNICODE as the C runtime's counterpart).
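
To make the difference between the two encodings concrete, here is a
rough sketch in Python (not Win32 code, and it calls no Windows API). It
only shows what the bytes of one file name look like in code page 1252
versus UTF-16LE. The sample names and the helpers fits_code_page and
utf16_code_units are made up for illustration.

```python
# Illustration only: one file name in the two Win32 string encodings.
name = "M\u00fcller.txt"  # "Mueller.txt" spelled with u-umlaut (U+00FC)

ansi_bytes = name.encode("cp1252")     # 8-bit form, as passed to an "A" function
wide_bytes = name.encode("utf-16-le")  # 16-bit form, as passed to a "W" function


def fits_code_page(text, code_page="cp1252"):
    """Return True if every character of text exists in the given code page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


def utf16_code_units(text):
    """Count the 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


print(ansi_bytes)                     # one byte per character
print(wide_bytes)                     # two bytes per character
print(fits_code_page(name))           # a Western European name fits 1252
print(fits_code_page("\u0444\u0430\u0439\u043b.txt"))  # a Cyrillic name does not
print(utf16_code_units("\U00010400"))  # beyond UCS-2: needs a surrogate pair
```

The last line is where UTF-16 and plain UCS-2 differ: a character outside
the 16-bit range takes two code units, which UCS-2 cannot express.
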
For some locales, there is no locale-specific 8-bit code page (I think
they substitute 1252 in those cases).
It seems that on NT-based versions of Win32, "A" functions are shims for
"W" functions; that is, the real native OS encoding is UTF-16.
As for the "Asian thing", I have a dim recollection of having heard of
some Asian charsets that really are 16-bit, not ISO 2022-style multibyte.
I don't claim to know if they are really used that way. That's why
"possibly".
[1] Yep, "CreateFile" is the basic function even for existing files, sort
of the other way around from Unix. I guess they think of "create file
handle"; nearly every function that creates a handle is CreateXXX.
Regards, Kai
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/