On Fri, Jun 13, 2003 at 03:18:59PM +0100, Markus Kuhn wrote:
> 
> [Fortunately though, UTF-16 remains of little bother to anyone in the
> Unix/Plan9 world, where UTF-16 and its 0x10ffff limit are virtually
> unheard of, except for the occasional shaking of heads, and very likely
> will remain so. The reduction from 0xffffffff to 0x7fffffff was
> technically reasonable though, as it permits the use of signed 32-bit
> integer types. UTF-16 remains an ugly miscarriage, because by placing
> the surrogates not at the end of the 16-bit space but into the middle of
> the code range, it leads to an incompatible binary sorting order in
> B-trees with UCS-4 and UTF-8 and therefore is useless for database
> applications that want to hide the internal encoding from the user of
> B-tree iterators.]
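
To make the sorting point concrete, here is a minimal sketch in Python
(the two sample characters and the helper name order() are my own
choice for illustration, not anything from the standards). It compares
one BMP character above the surrogate block with the first character
that needs a surrogate pair, using plain byte-wise comparison of the
kind a B-tree over raw keys would do:

```python
# Two sample code points (chosen only for illustration):
a = "\uff5e"       # U+FF5E, a BMP character located above the surrogate block
b = "\U00010000"   # U+10000, the first code point that needs a surrogate pair


def order(x, y):
    """Return '<', '>' or '=' for the ordinary comparison of x and y."""
    return "<" if x < y else ">" if x > y else "="


# Code point order, and byte-wise order in three encodings.
by_code_point = order(ord(a), ord(b))
by_utf8 = order(a.encode("utf-8"), b.encode("utf-8"))
by_ucs4 = order(a.encode("utf-32-be"), b.encode("utf-32-be"))
by_utf16 = order(a.encode("utf-16-be"), b.encode("utf-16-be"))

print("code points:", by_code_point)
print("UTF-8 bytes:", by_utf8)
print("UCS-4 bytes:", by_ucs4)
print("UTF-16 bytes:", by_utf16)
```

UTF-8 and big-endian UCS-4 agree with code point order here, while
UTF-16 puts the pair the other way round, because the lead surrogate
0xD800 compares lower than the code unit 0xFF5E.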

Hmm, UTF-8 is now being redefined as covering only 21 bits. The
Internet definition of UTF-8 is being revised to refer to the Unicode
specification. I believe, though, that UTF-8 as it is defined today in
ISO/IEC 10646 is still 31 bits. But Unicode is taking over the world,
and this will probably happen in the Unix/Plan9 world too.
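
For what it is worth, the difference between the two definitions is
easy to see in code. The following is only a sketch in Python; the
function name encode_utf8() and its max_cp parameter are hypothetical
names of mine. It uses the same bit layout for both cases: with the
31-bit limit a character can take up to six bytes, with the 0x10ffff
limit never more than four. It does not reject surrogate code points,
which a real encoder would have to do.

```python
def encode_utf8(cp, max_cp=0x7FFFFFFF):
    """Encode one code point with the UTF-8 bit layout.

    max_cp is the highest value accepted: 0x7FFFFFFF for the 31-bit
    form, 0x10FFFF for the form limited to the Unicode range.
    Illustrative sketch only: surrogates are not rejected.
    """
    if not 0 <= cp <= max_cp:
        raise ValueError("code point out of range: %#x" % cp)
    if cp <= 0x7F:
        return bytes([cp])
    # (highest code point, total bytes, marker bits of the lead byte)
    forms = (
        (0x7FF, 2, 0xC0),
        (0xFFFF, 3, 0xE0),
        (0x1FFFFF, 4, 0xF0),
        (0x3FFFFFF, 5, 0xF8),
        (0x7FFFFFFF, 6, 0xFC),
    )
    for top, n, lead in forms:
        if cp <= top:
            # Continuation bytes carry 6 bits each, lowest bits last.
            tail = [0x80 | ((cp >> (6 * i)) & 0x3F) for i in range(n - 1)]
            tail.reverse()
            return bytes([lead | (cp >> (6 * (n - 1)))] + tail)


# The top of the Unicode range fits in four bytes ...
print(encode_utf8(0x10FFFF).hex())
# ... while the top of the 31-bit range needs six.
print(encode_utf8(0x7FFFFFFF).hex())
```

Calling encode_utf8(0x110000, max_cp=0x10FFFF) raises ValueError,
whereas the default 31-bit limit accepts the same value.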

Best regards
keld
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
