Wilhelm Nößer writes:
> We'd like to point out that the literals are the most interesting point.
>
> Reason:
> missing functions can be implemented by anyone who thinks he/she
> needs them, but the literals and their structure must be defined
> by the compiler.
>
> Especially when you want to port existing code to Unicode, you need
> a way to represent your usual (English, 7-bit, ...) string literals in
> Unicode format.
>
> The format of the string literals determines the way other
> strings are handled.
>
> We DO NOT want to write Hebrew, Arabic, ... glyphs in our sources!
The C/C++ compiler does what is specified in the language spec. The
language spec provides no means to convert an ASCII string to
a uint16_t array at compile time. To work around this, you have two
options:
- Do the conversion at run time (in C++ possibly at static
initialization time); see the sketch after this list.
- Do the conversion in a preprocessing stage before you call the
C/C++ compiler.
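
As a rough illustration of the first option, here is a minimal C++
sketch that widens a 7-bit ASCII literal at static initialization
time. The helper name ascii_to_u16 is mine, not a standard function,
and it assumes the input really is plain ASCII:

#include <stdint.h>
#include <vector>

/* Widen a 7-bit ASCII string to 16-bit units; for ASCII the Unicode
   code point equals the byte value, so no table lookup is needed.
   (ascii_to_u16 is a hypothetical helper, not a standard API.) */
static std::vector<uint16_t> ascii_to_u16(const char *s)
{
    std::vector<uint16_t> out;
    for (; *s != '\0'; ++s)
        out.push_back(static_cast<unsigned char>(*s));
    out.push_back(0);              /* keep the terminating NUL */
    return out;
}

/* Initialized before main() as part of static initialization. */
static const std::vector<uint16_t> greeting = ascii_to_u16("Hello, world");

For the second option, a small script or code generator can emit the
equivalent { 'H', 'e', ..., 0 } initializer, so that no run-time work
is needed at all.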
> There are good reasons why Java and IBM's ICU have chosen UTF-16 over
> any other implementation.
Java chose UCS-2, not UTF-16, the major reason being to be able to
address the n-th character of a string in constant time. Now they are
starting to add UTF-16 support...
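
To make the constant-time argument concrete: in UTF-16, code points
outside the BMP occupy two 16-bit units (a surrogate pair), so
locating the n-th character requires a linear scan, whereas UCS-2
indexing is a plain array access. A sketch (the function name is
mine, for illustration only):

#include <stddef.h>
#include <stdint.h>

/* Return the index of the 16-bit unit where the n-th character
   (counting from 0) starts, scanning past surrogate pairs.
   This is O(n); UCS-2 avoids the scan by construction. */
static size_t utf16_char_index(const uint16_t *s, size_t len, size_t n)
{
    size_t i = 0;
    while (i < len && n > 0) {
        /* A high surrogate (0xD800..0xDBFF) opens a two-unit pair. */
        i += (s[i] >= 0xD800 && s[i] <= 0xDBFF) ? 2 : 1;
        n--;
    }
    return i;
}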
Bruno
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/