> -----Original Message-----
> From: Markus Kuhn [mailto:[EMAIL PROTECTED]]
...
> Your friendly amateur Ada language lawyer begs to differ. Ada does not
> dictate that Wide_Character'Size = 16. It only says that it "is a
> character type whose values correspond to the 65536 code positions of
> the ISO 10646 Basic Multilingual Plane (BMP)", but not what the memory
> representation of that type is. Wide_Character is an enumeration type
> with 2**16 values, i.e.
>
> type Wide_Character is (nul, soh ... FFFE, FFFF);
>
> Ada gurus will remember that the value range and the memory size can be
> handled rather independently in Ada, and I see no reason why a compiler
> author could not add a
>
> for Wide_Character'Size use 32;
>
> to his version of package Standard. You should of course immediately get
> a CONSTRAINT_ERROR exception if a value greater than U+FFFF found its way
> from C code into an Ada program.
Does that also mean that Character'Size can be any size *greater than*
or equal to 8 too?
The comment (in http://www.adahome.com/rm95/rm9x-A-01.html) that
"The first 256 positions have the same contents as type Character."
seems to imply that Character and Wide_Character have the same storage
width. So *both* Character (with values limited to 0..255) and
Wide_Character *could* really be, at the storage level, a UCS-2 character,
a UTF-16 code unit, or a UTF-32 character...
> I personally think that sizeof(wchar_t) == 4 is the right thing to do in
> a C library and that Ada, Java, Win32, etc. will eventually have to find
> ways around their restrictions to 16-bit. UTF-16 is not always the right
> answer, because it is a multi-word encoding.
Well, I think that for Java and Win32 the answer will be/is that
UTF-16 is used (internally) for the string datatypes; and that Java's
"char" is (will be) a UTF-16 code unit (not necessarily a character).
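
As a small illustration of "code unit, not necessarily a character", here
is a sketch in Python (chosen only because it is easy to run; the helper
name utf16_code_units is made up). It lists the 16-bit units that UTF-16
uses for a string:

```python
def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big")
            for i in range(0, len(data), 2)]


bmp_units = utf16_code_units("A")             # [0x0041]
pair_units = utf16_code_units("\U0001D11E")   # [0xD834, 0xDD1E]
```

A BMP character such as U+0041 takes one code unit, while U+1D11E takes a
surrogate pair, so a 16-bit "char" holding 0xD834 is only half of a
character.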
Kind regards
/kent k
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/