> -----Original Message-----
> From: Markus Kuhn [mailto:[EMAIL PROTECTED]]
...
> I am very happy with the added note that promises that no standard
> characters will be added beyond U-10FFFF.
Good! (Nitpicking: It's U+10FFFF (+ not -).)
> However, I'd prefer if as a
> consequence the entire code space starting at U-110000 upwards were
> declared to be reserved for private use outside the scope of UCS.
I can safely predict that that is not going to happen!
> We have now in wchar_t a nice infrastructure for handling 31-bit
> characters, and I do urge all implementors of UTF-8 encoders and
> decoders to keep them fully 31-bit transparent. UTF-16 is pretty
> irrelevant to the GNU/POSIX platform. The wc API was not designed to
> handle double-double-byte characters such as surrogate pairs.
> Why should Linux programmers destroy the potentially useful full 31-bit
> space, just because of silly interoperability concerns by the UTF-16
> crowd? They are
Why are interoperability concerns "silly"? Do you plan to live in
a Linux-only world, never to communicate with systems running
Windows, MacOS, Epoc, or any major database? (All employ UTF-16.)
> just applying flawed logic IMHO: Private use characters are per
> definition non-interoperable anyway, independent whether they can be
They should still be interchangeable and storable & processable (after
"private agreements") also in systems primarily employing UTF-16.
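
To make the limit concrete, here is a minimal sketch in C of how a code
point maps to UTF-16. The function name utf16_encode and its return
convention are my own, chosen only for illustration. A surrogate pair
carries 20 bits on top of the offset 0x10000, so U+10FFFF is the last
value that fits; anything from 0x110000 upwards has no UTF-16 form.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one code point as UTF-16.
   Returns the number of 16-bit units written to out (1 or 2),
   or 0 if the value cannot be represented in UTF-16. */
size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                  /* surrogate values are not characters */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;     /* BMP: a single 16-bit unit */
        return 1;
    }
    if (cp > 0x10FFFF)
        return 0;                  /* out of reach of surrogate pairs */
    cp -= 0x10000;                 /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

The private use planes 15 and 16 (U+F0000 to U+10FFFF) pass through
this encoder, while a private character placed at 0x110000 or above is
rejected. That is the interchange problem in a nutshell.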
Interoperability regards
/kent k
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/lists/