The Unicode 3.1 language tag characters are described in Unicode UAX
#27. Clearly, iconv converters from UTF-8 to any non-ISO-2022 encoding
should silently discard these characters. The real question is what
to do with the ISO-2022-JP-2 encoding.
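To make the "silently discard" case concrete, here is a minimal sketch
in Python rather than in the actual glibc C converters. The function
names are made up for illustration. It assumes that the characters to
drop are U+E0001 (LANGUAGE TAG) and U+E0020..U+E007F (the tag
characters and CANCEL TAG), and it uses Python's own codecs as a
stand-in for iconv.

```python
# Sketch only: drop Unicode language tag characters before converting
# to an encoding that has no way to represent them.
# Assumption: the characters to drop are U+E0001 (LANGUAGE TAG) and
# U+E0020..U+E007F (tag characters, ending with CANCEL TAG).

def is_tag_char(ch):
    """Return True if ch is a language tag character."""
    cp = ord(ch)
    return cp == 0xE0001 or 0xE0020 <= cp <= 0xE007F

def strip_language_tags(text):
    """Return text with all language tag characters removed."""
    return "".join(ch for ch in text if not is_tag_char(ch))

def encode_dropping_tags(text, encoding):
    """Hypothetical helper: encode text, silently discarding tags.

    Python's codecs stand in for an iconv converter here.
    """
    return strip_language_tags(text).encode(encoding)
```

For example, a string that starts with the tag "LANGUAGE j a" would be
converted to EUC-JP exactly as if the tag were not there.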
My plan is to change the glibc ISO-2022-JP-2 encoder so that, inside
text tagged "LANGUAGE j a", it tries JISX0208 and JISX0212 first,
before the Chinese and Korean encodings, and similarly prefers the
Chinese character sets inside "LANGUAGE z h" and the Korean character
set inside "LANGUAGE k o".
Does this meet the requirements of glyph-shape-sensitive Japanese
users? What do you think, Tomohiro?
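To illustrate the planned preference order, here is a rough sketch in
Python, not the glibc code. All names in it are hypothetical. The
default order of the CJK character sets is an assumption, the
can_encode callback stands in for the real table lookups, and tag
handling is simplified (CANCEL TAG just resets the language).

```python
# Sketch only: choose a CJK character set for each character according
# to the language tag currently in effect.

# Assumed default order when no language tag applies.
DEFAULT_ORDER = ["JISX0208", "JISX0212", "GB2312", "KSC5601"]

# Character sets to try first inside a given language tag.
PREFERRED = {
    "ja": ["JISX0208", "JISX0212"],
    "zh": ["GB2312"],
    "ko": ["KSC5601"],
}

def charset_order(lang):
    """Return the character sets to try, preferred ones first."""
    first = PREFERRED.get(lang[:2].lower(), [])
    return first + [cs for cs in DEFAULT_ORDER if cs not in first]

def tagged_chars(text):
    """Yield (language, character) pairs, consuming tag characters."""
    lang, in_tag = "", False
    for ch in text:
        cp = ord(ch)
        if cp == 0xE0001:                # LANGUAGE TAG starts a new tag
            lang, in_tag = "", True
        elif cp == 0xE007F:              # CANCEL TAG (simplified: reset)
            lang, in_tag = "", False
        elif 0xE0020 <= cp <= 0xE007E:   # tag letter maps to ASCII
            if in_tag:
                lang += chr(cp - 0xE0000)
        else:                            # ordinary character ends the tag
            in_tag = False
            yield lang, ch

def choose_charset(ch, lang, can_encode):
    """Return the first character set in preference order that can
    encode ch, or None. can_encode(charset, ch) is a hypothetical
    callback standing in for the converter's table lookups."""
    for cs in charset_order(lang):
        if can_encode(cs, ch):
            return cs
    return None
```

With this, a character present in both JISX0208 and GB2312 would go to
GB2312 inside "LANGUAGE z h" and to JISX0208 inside "LANGUAGE j a" or
outside any tag.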
Bruno
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/