Christoph Rohland writes:

> So if you assume that the source file is in UTF-8 normal string
> literals should be UTF-8.

Yes, but only if the compiler is gcc, no "coding:" marker is at the
top of the file, and no overriding command-line option has been
given.

> And this case would be handled without special casing, right?

The internal processing for UTF-8 string literals in this case would
be trivial. But the internal processing for UTF-16 or UCS-4 string
literals is not complicated either.
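
To make the "not complicated" claim concrete, here is a minimal sketch of
the conversion a compiler would need when the source is UTF-8 and the
literal must be stored as UCS-4 or UTF-16. The helper names utf8_decode
and utf16_encode are hypothetical and are not taken from gcc or any other
compiler; the sketch only assumes a C99 compiler with <stdint.h>.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence starting at s, with at
   most n bytes available. Stores the code point (UCS-4 value) in *out and
   returns the number of bytes consumed, or 0 on malformed input. */
size_t utf8_decode(const unsigned char *s, size_t n, uint_least32_t *out)
{
    size_t len, i;
    uint_least32_t cp, lowest;

    if (n == 0)
        return 0;
    if (s[0] < 0x80) {                      /* plain ASCII */
        *out = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {
        len = 2; cp = s[0] & 0x1F; lowest = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {
        len = 3; cp = s[0] & 0x0F; lowest = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {
        len = 4; cp = s[0] & 0x07; lowest = 0x10000;
    } else {
        return 0;                           /* stray continuation or invalid lead byte */
    }
    if (n < len)
        return 0;                           /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                       /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates and values beyond U+10FFFF. */
    if (cp < lowest || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;
    *out = cp;
    return len;
}

/* Hypothetical helper: encode a valid code point as UTF-16.
   Returns the number of 16-bit units written (1 or 2). */
size_t utf16_encode(uint_least32_t cp, uint_least16_t out[2])
{
    if (cp < 0x10000) {
        out[0] = (uint_least16_t)cp;
        return 1;
    }
    cp -= 0x10000;
    out[0] = (uint_least16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint_least16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

For a UTF-8 source and a UTF-8 literal the bytes are copied unchanged;
for the other two forms, each decoded code point is either stored
directly (UCS-4) or passed through utf16_encode.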

The important point is that several compiler vendors agree on what
u8"...", u16"..." and u32"..." mean and on what the types are
called. (Can't we use uint_least16_t instead of utf16_t?)
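
For comparison, ISO C11 later settled on the spellings u8"...", u"..."
and U"..." rather than the u16/u32 prefixes discussed here, and it
defines char16_t and char32_t in <uchar.h> as the same types as
uint_least16_t and uint_least32_t. A minimal sketch, assuming a C11
compiler; the variable names are made up for illustration, and the
characters are written as universal character names so the result does
not depend on the source file encoding.

```c
#include <stdint.h>

/* U+20AC EURO SIGN in the three encoding forms. */
const char           euro_u8[]  = u8"\u20AC";  /* UTF-8:  E2 82 AC */
const uint_least16_t euro_u16[] = u"\u20AC";   /* UTF-16: 20AC */
const uint_least32_t euro_u32[] = U"\u20AC";   /* UTF-32: 000020AC */

/* U+1F600 lies outside the BMP, so UTF-16 needs a surrogate pair. */
const uint_least16_t face_u16[] = u"\U0001F600";
const uint_least32_t face_u32[] = U"\U0001F600";
```

Each array also holds a terminating zero element, so euro_u8 has four
elements and face_u16 has three.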

> For UTF-8 see above, for UCS-4 I thought the wchar_t is the right
> representation.

Currently only on glibc systems. wchar_t == UCS-4 is only a
recommendation in ISO C 99, not mandatory (unfortunately).
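
A program can test this at compile time: ISO C99 specifies the macro
__STDC_ISO_10646__, which an implementation defines only if wchar_t
values are ISO 10646 code points. A minimal sketch; the two function
names are hypothetical, and the result depends on the platform.

```c
#include <wchar.h>

/* Returns 1 if the implementation promises that wchar_t values are
   ISO 10646 code points, 0 if it makes no such promise. */
int wchar_is_iso10646(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if wchar_t is wide enough for every code point up to
   U+10FFFF. This says nothing about which encoding it uses. */
int wchar_holds_all_code_points(void)
{
    return WCHAR_MAX >= 0x10FFFF;
}
```

Code that must be portable cannot assume either function returns 1, so
it has to fall back to explicit conversion where the macro is missing.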

Bruno
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
