On Wed, 4 Dec 2002, Maiorana, Jason wrote:
> If characters are ever introduced which have no precomposed codepoint,
> then it will be difficult for a font to "normalize" them to one
> glyph which has the appropriate internal layout. The font file itself
> would then have to know about composition rules, such as when
> X is composed with Y then Z, then use this glyph XYZ which has no
> single codepoint in Unicode.

Have you ever heard of OpenType and AAT fonts? Modern font technologies
and modern rendering engines (Pango, Uniscribe, AAT/ATSUI, Graphite) can
all do that. Otherwise, how would Indic scripts be rendered at all? What
you describe above is done every day by Pango, Uniscribe, AAT/ATSUI,
and Graphite.

> For that reason, I don't like form D at all. I wonder how much space
> it would take to represent every possible Jamo combination, then just
> do away with combining characters altogether...

No way!! The biggest blunder ever made by the Korean national standards
body was to insist that 11,172 modern precomposed syllables be encoded
in Unicode/ISO 10646. The next biggest blunder was to encode tens of
totally unnecessary cluster Jamos, when 17+11+17 plus a few more would
have been more than sufficient. The next stupid thing they did was to
remove the compatibility decomposition between cluster Jamos and basic
Jamo sequences, although they should be canonically (not just
compatibly) equivalent. Now you're saying that all possible
combinations of them should be encoded. How many? It's __infinite__ in
theory. In practice, it could be around 1.5 million. That's more than
the total number of code points available in a 20.1-bit coded character
set, which is what ISO 10646/Unicode is.

  Jungshik Shin

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
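[Editor's note: the 11,172 figure above is the standard Hangul syllable
count, 19 leading consonants x 21 vowels x 28 trailing positions (one of
which is "none"), and the composition NFC performs on conjoining Jamos
is purely arithmetic per the Unicode Standard, with no font involvement.
A quick sketch of both facts in Python, using only the standard
unicodedata module:]

```python
import unicodedata

# Hangul syllable composition in Unicode is pure arithmetic:
#   S = 0xAC00 + (L_index * 21 + V_index) * 28 + T_index
LEADING, VOWELS, TRAILING = 19, 21, 28  # TRAILING includes "no final consonant"
print(LEADING * VOWELS * TRAILING)      # -> 11172 precomposed syllables

# The conjoining Jamo sequence <HIEUH U+1112, A U+1161, NIEUN U+11AB>
# composes under NFC to the single precomposed syllable U+D55C (HAN).
jamo = "\u1112\u1161\u11AB"
syllable = unicodedata.normalize("NFC", jamo)
print(f"U+{ord(syllable):04X}")         # -> U+D55C

# NFD decomposes it back to the same three-Jamo sequence.
assert unicodedata.normalize("NFD", syllable) == jamo
```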
