Michael B. Allen writes:
> So the first column
> is a big endian representation of the multibyte sequence corresponding
> to the UCS code in the right column? So I could generate the maps from
> that information and use the libiconv *_mbtowc functions to do multibyte
> conversions.

Yes.

> Incidentally why is there no ISO-2022-JP.TXT?

ISO-2022-JP cannot be described by such a table. It's a stateful
encoding: what a byte sequence means depends on the escape sequences
that precede it.

Even with an expat that understands encodings other than UTF-8 and
ISO-8859-1, people should continue using UTF-8 for their XML files.
Quoting from http://www.w3.org/TR/charmod/ :

  "When specifications choose to allow encodings other than Unicode
   encodings, implementers should be aware that the correspondence
   between the characters of a legacy encoding and Unicode characters
   may in practice depend on the software used for transcoding. See
   the Japanese XML Profile [http://www.w3.org/TR/japanese-xml/] for
   examples of such inconsistencies."

Bruno
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
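As a rough illustration of the two points in the reply above, here is a minimal Python sketch. It assumes the table files consist of lines like `0x8140 0x3000 # comment`, with the first column holding the big-endian bytes of the multibyte sequence and the second the UCS code; the exact file layout, the `load_table` helper and the `SAMPLE` data are assumptions made for this example and are not part of libiconv. The second half uses Python's built-in `iso2022_jp` codec to show why a stateless table is not enough for ISO-2022-JP.

```python
# Hypothetical sample in the assumed two-column layout (not a real table file).
SAMPLE = """\
# multibyte   UCS
0x41    0x0041  # LATIN CAPITAL LETTER A
0x8140  0x3000  # IDEOGRAPHIC SPACE
"""


def load_table(text):
    """Map multibyte sequences (as bytes) to UCS code points (as ints)."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # no mapping on this line
        digits = fields[0][2:]  # strip the leading "0x"
        if len(digits) % 2:
            digits = "0" + digits  # pad to whole bytes
        # The hex digits are read left to right, i.e. big endian.
        table[bytes.fromhex(digits)] = int(fields[1], 16)
    return table


table = load_table(SAMPLE)

# Statefulness: the same two bytes decode differently depending on which
# escape sequence came before them, so no flat byte-to-UCS table can work.
in_ascii_state = b'$"'.decode("iso2022_jp")
in_jisx0208_state = b'\x1b$B$"\x1b(B'.decode("iso2022_jp")
```

In the initial ASCII state the bytes 0x24 0x22 are simply the characters `$` and `"`; after the escape sequence ESC $ B they form a single JIS X 0208 character. A converter therefore has to carry shift state between calls, which a per-sequence lookup table alone does not capture.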
