JS> I'm pretty sure there will be some objections to what you wrote
JS> about them because there are string manipulation routines written
JS> for them in use.

Yes, and they are all buggy.

As you know, Xlib contains some simple text manipulation routines for
a subset of ISO 2022 known as COMPOUND_TEXT.  Like all ISO 2022 subsets
known to man, woman or child, COMPOUND_TEXT is both too large to be
manageable and too small to be useful for multilingual text.

Although this code has been regularly maintained for a good five years
(both by us and by the commercial Unix vendors), we keep uncovering new
bugs in our COMPOUND_TEXT implementation.  We are committed to
maintaining the COMPOUND_TEXT code for the foreseeable future, but we
recommend that new applications use only the Xlib facilities for
conversion to/from Unicode.

(We have seriously considered dumping the internal COMPOUND_TEXT
processing and systematically going through Unicode, but this would
require handling some form of language tagging in order to remain
fully compatible with the ISO 2022-based implementation, especially in
the presence of Han characters.)
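
To make the Han problem concrete, here is a minimal sketch.  It is not
Xlib code: it assumes Python's built-in iso2022_jp and iso2022_kr
codecs as stand-ins for an ISO 2022 implementation, and the helper
names are made up for illustration.  The same Han character is
announced by a different charset designation in each encoding, which
tells the reader whether the text is Japanese or Korean.  After
conversion to Unicode both collapse to one code point, so that hint is
gone unless it is carried separately as a language tag.

```python
# Illustrative sketch only: Python's codecs stand in for an ISO 2022
# implementation; this is not Xlib's COMPOUND_TEXT code.

# U+6F22, a Han character present in both JIS X 0208 and KS X 1001.
HAN = "\u6f22"


def encode_both(text):
    """Return the ISO-2022-JP and ISO-2022-KR encodings of text."""
    return text.encode("iso2022_jp"), text.encode("iso2022_kr")


def round_trip(text):
    """Encode text both ways, then decode each result back to Unicode."""
    jp, kr = encode_both(text)
    return jp.decode("iso2022_jp"), kr.decode("iso2022_kr")


if __name__ == "__main__":
    jp, kr = encode_both(HAN)
    # The byte strings differ: each carries its own charset designation.
    print(jp)
    print(kr)
    # Both decode to the same single code point; the designation is lost.
    print(round_trip(HAN))
```

A converter that goes systematically through Unicode would have to
store that lost designation somewhere if it wants to regenerate the
original ISO 2022 byte stream.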

                                        Juliusz
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
