Re: Proposal for 2 Byte Unicode implementation in gcc and glibc

Florian Weimer Thu, 10 Aug 2000 02:39:33 -0700

  Karlsson Kent - keka <[EMAIL PROTECTED]> writes:

> > The functions which have to handle character
> > properties with wchar_t can and should expect precomposed input,
> > something which is not at all possible with UTF-16.
> 
> ???
> 
> Whether "precomposed input" (or to be more precise, input in Normal
> Form C; where you in general WILL find combining characters!!) is
> used or not is *completely orthogonal* to the issue of whether UTF-8,
> UTF-16, UTF-32, or even SCSU is used for the string representation.

The problem is the C API: you can't query character properties of
surrogate pairs using isupper() and friends, because each call takes
a single character value and a surrogate pair is two code units.
Other languages using UCS-2 or UTF-16 probably have a similar problem
with surrogates.
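
To illustrate the point, here is a minimal sketch in C. The helper
names (utf16_decode, fits_in_wchar, query_upper) are hypothetical and
not part of glibc or any other C library; only iswupper() and WCHAR_MAX
are standard. The sketch assumes the text is held as an array of 16-bit
UTF-16 code units. A pair has to be combined into one code point before
it can be classified, and that only works if wchar_t is wide enough to
hold the result.

```c
#include <stddef.h>
#include <stdint.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helpers for illustration; none of these are libc functions. */
static int is_high_surrogate(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
static int is_low_surrogate(uint16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

/* Decode one code point from UTF-16 code units.
 * Returns the number of units consumed (1 or 2), or 0 if the input
 * is empty or starts with an unpaired surrogate. */
static size_t utf16_decode(const uint16_t *s, size_t len, uint32_t *cp)
{
    if (len == 0)
        return 0;
    if (!is_high_surrogate(s[0])) {
        if (is_low_surrogate(s[0]))
            return 0;               /* low surrogate without a leading high one */
        *cp = s[0];                 /* ordinary BMP character: one unit */
        return 1;
    }
    if (len < 2 || !is_low_surrogate(s[1]))
        return 0;                   /* high surrogate without a trailing low one */
    *cp = 0x10000u
        + (((uint32_t)s[0] - 0xD800u) << 10)
        + ((uint32_t)s[1] - 0xDC00u);
    return 2;
}

/* Nonzero if the code point can be handed to iswupper() and friends
 * as a single wchar_t on this platform. With a 16-bit wchar_t this is
 * false for everything a surrogate pair encodes. */
static int fits_in_wchar(uint32_t cp)
{
    return cp <= (uint32_t)WCHAR_MAX;
}

/* Returns the iswupper() result, or -1 if the property cannot be
 * queried through the C API because the code point does not fit. */
static int query_upper(uint32_t cp)
{
    if (!fits_in_wchar(cp))
        return -1;
    return iswupper((wint_t)cp);
}
```

For example, the code units D801 DC00 decode to U+10400. Passing either
unit to iswupper() on its own says nothing about that character, and
with a 16-bit wchar_t query_upper() has to give up and return -1.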
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/
