At Sun, 13 Jan 2002 14:49:15 +0100 (CET),
peter karlsson wrote:
>
> Tomohiro KUBOTA:
>
> > Because the algorithm transliterations is not very good.
>
> I know.
>
> > And, many people in the world have to use a small subset of softwares
> > only because such softwares support their native languages.
>
> We're talking about the web pages here, the only software that need
> Unicode support here are the browsers, and most of them do have it (at
> varying degrees).
>
> > Oh, very good. Please note that east Asian will need not only display
> > support but also input support, i.e., XIM support.
>
> Yes, I'm very aware of that as well (although my direct experience with
> IMs is limited). I have worked with the Unicode-adaption of our browser
> for over a year.
>
> > (note there is a rival; ISO-2022 is a multilingual encoding scheme
> > with much longer history).
>
> Yeah, and it's a mess, to be honest. This kind of "state-driven" (for
> lack of a better word) encodings where you cannot easily sync (as you
> can with UTF-8) is not something I like (the same goes for HZ, which is
> just a "simplified" form of ISO-2022).
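The "sync" property mentioned in the quoted text can be illustrated with a small sketch. It assumes Python's standard codecs; the helper name `resync` is hypothetical and only for illustration. In UTF-8, a decoder dropped at an arbitrary byte offset can find the next character boundary by looking at the bytes alone, while in ISO-2022-JP the meaning of a byte pair depends on the escape sequence that preceded it.

```python
def resync(data: bytes, pos: int) -> int:
    """Return the index of the first UTF-8 character boundary at or after pos."""
    # UTF-8 continuation bytes always have the bit pattern 10xxxxxx,
    # so skipping them is enough to reach the start of a character.
    while pos < len(data) and (data[pos] & 0xC0) == 0x80:
        pos += 1
    return pos


utf8 = "日本語".encode("utf-8")        # three bytes per character here
start = resync(utf8, 1)                # offset 1 is inside the first character
print(utf8[start:].decode("utf-8"))    # decodes cleanly from the next boundary

# ISO-2022-JP: the same two payload bytes mean different things
# depending on the escape sequence (the "state") seen before them.
stateful = "亜".encode("iso2022_jp")   # ESC $ B, two payload bytes, ESC ( B
payload = stateful[3:5]                # payload with the escape sequences cut off
print(payload.decode("iso2022_jp"))    # read in the initial ASCII state instead
```

Without the leading escape sequence, the payload bytes are read as two ASCII characters rather than one kanji, which is the loss of synchronization the quoted text refers to.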

Note that browsers cannot be free from "state" even if they use Unicode.
For example, rendering of Unicode unified CJK Han Ideographs (which are
logically the same character from a certain point of view, but a large
part of which have significantly different glyphs) needs the "state" of
"language".

Thus, though it is true that ISO-2022 is very complex, please note that
Unicode is not so simple either. If Unicode were simpler than human
natural languages, it would mean that Unicode has defects.

> > I am also wrestling with a problem that Unicode doesn't have a
> > relyable mapping table from/to Japanese legacy encodings.
>
> That's because of some poor design of the legacy encodings, not
> Unicode, with multiple mappings of some characters.

Never. Before the appearance of Unicode, these encodings were identical,
except for a small number of private additional characters. For example,
Shift_JIS and CP932 are identical if we don't think about conversion
to/from Unicode. Most Japanese people don't even know the name "CP932",
and they think they are using Shift_JIS. What they think is correct.
However, when Unicode came, it stated "what you are using with Windows
is CP932, not Shift_JIS." Unicode is the origin of this confusion,
because it introduced many distinct legacy encodings into Japan.

(I am referring to the chapter "Conversion tables differ between
venders" in my document
http://www.debian.or.jp/~kubota/unicode-symbols.html .)

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"
http://www.debian.org/doc/manuals/intro-i18n/
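
The Shift_JIS / CP932 point above can be seen in a small sketch. It assumes Python's bundled `shift_jis` and `cp932` codecs, which are only one implementation's choice of conversion tables; the helper name `code_points` and the sample bytes are chosen for illustration. The same byte sequence is decoded under both names and the resulting Unicode code points are compared.

```python
def code_points(raw: bytes) -> tuple[int, int]:
    """Decode one double-byte character as Shift_JIS and as CP932."""
    as_sjis = raw.decode("shift_jis")
    as_cp932 = raw.decode("cp932")
    return ord(as_sjis), ord(as_cp932)


# Sample byte sequences: a wave-dash-like symbol, a minus-like symbol,
# and an ordinary kanji for comparison.
for raw in (b"\x81\x60", b"\x81\x7c", b"\x88\x9f"):
    sjis_cp, cp932_cp = code_points(raw)
    verdict = "same" if sjis_cp == cp932_cp else "DIFFERENT"
    print(f"{raw.hex()}: shift_jis=U+{sjis_cp:04X} cp932=U+{cp932_cp:04X} {verdict}")
```

With these codecs the kanji maps identically, while the two symbols map to different Unicode code points, so text that round-trips through Unicode under one name may fail to encode under the other.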

