Sun, 16 Sep 2001 01:14:06 -0700, Carl W. Brown <[EMAIL PROTECTED]> wrote:
> If it can be demonstrated that there is a real need for an encoding
> like CESU-8 then it should be very different from UTF-8. How does
> SCSU for example sort?
SCSU encoding is non-deterministic and its representations can't
be compared lexicographically at all (logically equal strings might
compare unequal).
Ehh, we wouldn't have the problem with CESU-8 now if Unicode hadn't
been described as a 16-bit encoding in the past. I still think that
UTF-16 was a big mistake. Too bad that it still affects people who
avoid it.
We can't change the past, but I hope that at least UTF-8 processing can
be done without treating surrogates in any special way. Surrogates are
relevant only for UTF-16; by not using UTF-16 you should be free of
surrogate issues, apart from a silly unused area in the character
numbers and a silly highest character number. Please don't spread
UTF-16 madness where it doesn't belong.
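To make this concrete, here is a rough Python sketch, not a definitive
implementation. The names utf8_encode, cesu8_encode and _three_bytes are
made up for illustration and do not come from any library. It shows that
a UTF-8 encoder needs to know only the code point number, plus the two
"silly" range limits (the unused area D800..DFFF and the maximum 10FFFF),
while a CESU-8 style encoder first has to build a UTF-16 surrogate pair.

```python
def _three_bytes(v: int) -> bytes:
    """Pack a 16-bit value into the 3-byte UTF-8 bit layout."""
    return bytes([0xE0 | (v >> 12), 0x80 | ((v >> 6) & 0x3F), 0x80 | (v & 0x3F)])


def utf8_encode(cp: int) -> bytes:
    """Encode one code point as UTF-8 from its number alone."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("above the highest character number")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("inside the unused (surrogate) area")
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return _three_bytes(cp)
    # Supplementary characters: 4 bytes, no surrogates involved.
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])


def cesu8_encode(cp: int) -> bytes:
    """Illustrative CESU-8 style encoder: same as UTF-8 up to U+FFFF,
    but supplementary characters go through a UTF-16 surrogate pair."""
    if cp < 0x10000:
        return utf8_encode(cp)
    v = cp - 0x10000
    high = 0xD800 | (v >> 10)    # leading surrogate
    low = 0xDC00 | (v & 0x3FF)   # trailing surrogate
    return _three_bytes(high) + _three_bytes(low)


if __name__ == "__main__":
    for cp in (0xFFFD, 0x10000):
        print(f"U+{cp:04X}", utf8_encode(cp).hex(" "), "|", cesu8_encode(cp).hex(" "))
```

In this sketch U+10000 takes 4 bytes in UTF-8 and 6 bytes in the
CESU-8 style form. Comparing the bytes, UTF-8 keeps U+FFFD before
U+10000 (code point order), while the surrogate-based form puts U+10000
first, which is the UTF-16 code unit order.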
--
__("< Marcin Kowalczyk * [EMAIL PROTECTED] http://qrczak.ids.net.pl/
\__/
^^ SUBSTITUTE SIGNATURE
QRCZAK