JS> I'm pretty sure there will be some objections to what you wrote
JS> about them because there are string manipulation routines written
JS> for them in use.
Yes, and they are all buggy.
As you know, Xlib contains some simple text manipulation routines for
a subset of ISO 2022 known as COMPOUND_TEXT. Like all ISO 2022 subsets
known to man, woman or child, COMPOUND_TEXT is both too large to be
manageable and too small to be useful for multilingual text.
Although this code has been regularly maintained for a good five years
(both by us and by the commercial Unix vendors), we keep uncovering new
bugs in our COMPOUND_TEXT implementation. We are committed to
maintaining the COMPOUND_TEXT code for the foreseeable future, but we
recommend that new applications use only the Xlib facilities for
conversion to/from Unicode.
(We have seriously considered dumping the internal COMPOUND_TEXT
processing and systematically going through Unicode, but this would
require handling some form of language tagging in order to remain
fully compatible with the ISO 2022-based implementation, especially in
the presence of Han characters.)
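
To make the language tagging point concrete, here is a minimal sketch
(not Xlib code). In COMPOUND_TEXT, the escape sequence that designates
a 94x94 set into GL also says, implicitly, which national standard the
following Han characters come from; once the text is converted to
Unicode, the unified ideograph no longer carries that information.
ct_lang_hint is a hypothetical helper, and mapping a charset directly
to a language tag is my simplification.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, NOT part of Xlib.
 * Looks at the start of a COMPOUND_TEXT byte string and, if it begins
 * with an escape sequence designating a 94x94 set into GL (ESC $ ( F),
 * returns a language hint for that set. Returns NULL otherwise.
 * The charset-to-language mapping is an assumption for illustration.
 */
static const char *ct_lang_hint(const char *s, size_t len)
{
    if (len < 4 || memcmp(s, "\x1b$(", 3) != 0)
        return NULL;

    switch (s[3]) {
    case 'A': return "zh";   /* GB 2312 */
    case 'B': return "ja";   /* JIS X 0208 */
    case 'C': return "ko";   /* KS C 5601 */
    default:  return NULL;   /* some other designation */
    }
}
```

For instance, the bytes ESC $ ( A 0x56 0x50 (GB 2312) and
ESC $ ( B 0x43 0x66 (JIS X 0208) both denote the ideograph that Unicode
encodes as U+4E2D. The helper yields "zh" for the first and "ja" for
the second, which is the kind of information a Unicode-only pipeline
would have to carry in a separate tag.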
Juliusz
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/