Re: UTF16 and GCC

Christoph Rohland Wed, 11 Jul 2001 23:34:01 -0700
Hi Bruno,

Sorry, I did not see the start of the discussion, but I would like to
comment nevertheless ;-)

On Wed, 11 Jul 2001, Bruno Haible wrote:
> Joseph S. Myers writes:
>> Systems for string literals in specified character sets have been
>> discussed on the WG14 reflector, but AFAICT without any working
>> papers yet even in the WG14 document register
> 
> At least their patch is something in the direction of portably
> written and still legible multilingual strings.
> 
> The L"wide string literal" syntax suffers from non-portability
> across systems and across locales, because ISO C fails to mandate
> that wchar_t is 32-bit ISO 10646.
> 
> But the 'u' prefix would better be used for UTF-8 string literals,
> not UTF-16 string literals. So I'm proposing the following syntax
> 
>                     u"UTF-8 string literal"
> 
> This way no extra 16-bit string functions are needed - the 8-bit
> str* functions in libc will do.

Why do you need a special UTF-8 string literal? UTF-8 can be based on
standard string literals, since it is identical to ASCII in the ASCII
range and its basic unit is 8 bits.
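
To illustrate the point, here is a minimal sketch in C, assuming the
UTF-8 bytes are spelled out with hex escapes so that the source file
encoding does not matter. The names greeting and utf8_codepoints are
made up for this example and do not come from any patch or library.

```c
#include <stddef.h>
#include <string.h>

/* "Grüße" as UTF-8 bytes in an ordinary string literal.
   The literal is split before "e" so that the 'e' is not
   absorbed into the preceding \x escape. */
static const char greeting[] = "Gr\xc3\xbc\xc3\x9f" "e";

/* Hypothetical helper: count code points by skipping UTF-8
   continuation bytes (bit pattern 10xxxxxx). */
static size_t utf8_codepoints(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    return n;
}
```

The ordinary str* functions work on such a string unchanged:
strlen(greeting) counts bytes, while utf8_codepoints(greeting) counts
characters.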

What we propose and need is a UTF-16 string literal. See the following
paper for the rationale behind it:

http://wwwold.dkuug.dk/jtc1/sc22/wg20/docs/n830-utf-16-c.txt

It was discussed positively in the Unicode consortium:

http://www.unicode.org/unicode/members/L2001/01220.htm

(Sorry the latter is for Unicode Consortium members only...)
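
For comparison, a rough sketch of what UTF-16 text looks like in C when
no dedicated literal syntax is available: the code units have to be
written out by hand as an array. This is only an illustration and is
not taken from the paper; greeting16 and u16_strlen are hypothetical
names, and uint16_t is assumed to be available from <stdint.h>.

```c
#include <stddef.h>
#include <stdint.h>

/* "Grüße" as UTF-16 code units, written out one by one. */
static const uint16_t greeting16[] = {
    0x0047, 0x0072, 0x00FC, 0x00DF, 0x0065, 0x0000
};

/* Hypothetical 16-bit counterpart of strlen(): counts code
   units up to, but not including, the terminating zero. */
static size_t u16_strlen(const uint16_t *s)
{
    size_t n = 0;
    while (s[n] != 0)
        ++n;
    return n;
}
```

Here u16_strlen(greeting16) counts five code units, and the array form
is noticeably harder to read than a quoted string.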

Greetings
                Christoph


-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/