On Thu, 25 Dec 2003, Philippe Verdy wrote:

> "Michael Everson" <[EMAIL PROTECTED]>
> > We have encoded 70,000 of them.
>
> All depends on the way you define characters. Most ideographs are composed,
> but Unicode and the CJK unification working groups have failed for now to
> define a coherent definition of how these characters really compose, so we

Is there a well-defined set of components that can unambiguously
compose all Chinese characters and that all interested parties (or at
least the national standards bodies) can agree upon?

> are still assisting to an always exploding number of compound ideographs,
> created everyday by Han users.
>
> If Latin characters were counted the way Han is, we would probably reach
> similar (may be even more) composed "characters". It's just infortunate that
> Han lacks a way to describe its composition model

You're trying to bridge the enormous gulf between alphabetic scripts
like Latin, Greek, and Cyrillic on the one hand and an 'isolating' (my
own word) script like Chinese on the other. It's true that Chinese
characters have some 'phonetic' characteristics, but one of the
important things to consider is how Chinese characters have been
perceived by their users. It might be helpful to 'decompose' or
'disassemble' Chinese characters into smaller components when you
design fonts (Bitstream and some other foundries tried this, and
somebody at Stanford also had a web page on it), but I don't think
there's anything fundamentally wrong with the current 'encoding' model
of Chinese characters in Unicode/10646.

> (it used to be the case too for the Hangul Alphabet,

When was that the case? It never was.

> but recent works seem to demonstrate that the
> complexity of Hangul is just superficial in Unicode but forgets the actual

"Recent" works by whom? Philippe, it may be recent and new to you (I'm
sorry to say it, but a lot of the things you've been 'discovering' are
common knowledge to Koreans and to many others), but it's been that way
since the invention of the Korean script in 1443.

Jungshik

