On Friday, March 15, 2002, at 08:48, Marco Cimarosti wrote:
> O, no! At least one of them has a (super)natural origin: CJK ideographs
> came carved on the shell of a gigantic turtle which appeared in dream to
> Cang Jie. :-)
That reminds me that Hanzi (Kanji in Japanese) can generate new characters simply by combining 'radicals' ('Bushu' in Japanese). Put 'heart' (心) next to 'life' (生) and you get 'sex' (性), for instance. Unlike the Roman alphabet, which is relatively static, the Kanji repertoire is very dynamic.

So I can't help asking you guys this question: how will Unicode cope with this kind of dynamically changing character set? So far Kanji users have gotten by with a limited number of coded character sets, not because they are content with the current sets but because it is so hard to push even one character into them. When Japanese Industrial Standards (JIS) upgraded JISX0208 (first fixed in 1978, aka Old JIS) in 1990 (New JIS), it created big chaos. And new chaos is likely to arise with JISX0212-1990 being upgraded to JISX0213-2000.

You may say this can be resolved by regarding each Kanji not as a character but as a word (lexically speaking, this does make sense) and then using some sort of ligature to represent it. That way you could reduce the number of code points to the number of Bushu. But this approach already failed when Unicode 2.0 decided to give every theoretically possible Hangul syllable a distinct code point, unlike Unicode 1.0, which used a ligature model to represent one character. As a result Hangul now has even more code points than Traditional Chinese.

With this, the Unicode Consortium has lost a good reason to reject new proposals to add more characters. If Elvish gets code points, why shouldn't real, living languages get more? CJK made the greatest compromise when Unicode was first created, a compromise that has hardly paid off: it accepted code point sharing even though that hardly makes sense linguistically. Then came Unicode 2.0 and the Hangul expansion, then surrogate pairs. What's next? Making Unicode 128-bit like an IPv6 address so you can include Tengwar and Klingon with less objection? I can't help but say, give me a break.
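To make the Hangul point above concrete, here is a minimal Python sketch of the arithmetic behind the precomposed syllables: 19 leading consonants, 21 vowels and 28 trailing slots (including "none") give 11,172 code points starting at U+AC00. The names `compose_hangul` and `SYLLABLE_COUNT` are my own, chosen for illustration; only `unicodedata.normalize` is a real library call. The last lines show, for contrast, that an Ideographic Description Sequence such as ⿰忄生 merely describes a Kanji and never normalizes to the encoded character, so it is not the ligature model discussed above.

```python
import unicodedata

# Constants of the precomposed Hangul syllable block (U+AC00..U+D7A3).
S_BASE = 0xAC00
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28  # leading, vowel, trailing (0 = none)

# Every combination has its own code point.
SYLLABLE_COUNT = L_COUNT * V_COUNT * T_COUNT


def compose_hangul(lead: int, vowel: int, trail: int = 0) -> str:
    """Illustrative helper: map jamo indices to one precomposed syllable."""
    if not (0 <= lead < L_COUNT and 0 <= vowel < V_COUNT and 0 <= trail < T_COUNT):
        raise ValueError("jamo index out of range")
    return chr(S_BASE + (lead * V_COUNT + vowel) * T_COUNT + trail)


# Lead index 18, vowel index 0, trailing index 4 yields U+D55C.
han = compose_hangul(18, 0, 4)

# Normalization converts between the precomposed and the jamo form.
jamo = unicodedata.normalize("NFD", han)
roundtrip = unicodedata.normalize("NFC", jamo)

# An Ideographic Description Sequence only describes a character;
# normalization leaves it untouched.
ids = "\u2ff0\u5fc4\u751f"   # ⿰ + 忄 + 生
sei = "\u6027"               # 性
ids_is_equivalent = unicodedata.normalize("NFC", ids) == sei

print(SYLLABLE_COUNT, hex(ord(han)), len(jamo), roundtrip == han, ids_is_equivalent)
```

So Hangul can be spelled either way and round-trips through normalization, while Kanji has no such canonical composition from its parts; each new character still needs its own code point.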
I confess I enjoyed this thread on whether Tengwar should be included in Unicode. It's fun. It's cute. But isn't this too much for those who accepted the compromise for UNIcode? Tengwar should wait until more critical issues are resolved. Many (including me) would be pissed if Tengwar were added BEFORE Ciao-Ciao's poetry and Man-Yo-Shu become encodable in Unicode. Well, it may take decades, if not centuries, for Tengwar, Klingon and others to get a chance, but so what? They won't go away after all of us here are dead.

Dan the Man with Too Many Things to Encode Already

