>> For that reason, I don't like form D at all. I wonder how much space
>> it would take to represent every possible Jamo combination, then just
>> do away with combining characters altogether...
> No way!! The biggest blunder ever made by the Korean nat'l standard body
> is to insist that 11,172 modern precomposed syllables be encoded
> in Unicode/10646. The next biggest blunder they made is to encode tens
> of totally unnecessary cluster Jamos when only 17+11+17 + a few more
> would have been more than sufficient. The next stupid thing they did is
> to remove compatibility decomposition between cluster Jamos and basic
> Jamo sequences although they should be canonically (not just compatibly)
> equivalent. Now, you're saying that all possible combinations of them
> be encoded. How many? It's __infinite__ in theory. In practice, it could
> be around 1.5 million. That's more than the total number of codepoints
> available in the 20.1-bit coded character set which is ISO 10646/Unicode.
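For reference, the figures quoted above can be checked with a short Python sketch. It assumes the usual layout of the modern Hangul syllable block (19 leading consonants x 21 vowels x 28 finals, counting "no final" as one) and the 17-plane code space of Unicode; the helper names (syllable_count, codepoint_space, decompose) are made up for illustration, and the 1.5 million figure is simply the estimate quoted above, not something the code derives.

```python
import math
import unicodedata

# Assumed layout of the modern Hangul syllable block:
# 19 leading consonants x 21 vowels x 28 finals (27 consonants + "none").
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7


def syllable_count():
    """Number of precomposed modern syllables (hypothetical helper)."""
    return L_COUNT * V_COUNT * T_COUNT


def codepoint_space():
    """17 planes of 65,536 code points each, U+0000..U+10FFFF."""
    return 17 * 0x10000


def decompose(syllable):
    """Arithmetically split one precomposed syllable into conjoining Jamo."""
    index = ord(syllable) - S_BASE
    if not 0 <= index < syllable_count():
        raise ValueError("not a precomposed modern Hangul syllable")
    lead = L_BASE + index // (V_COUNT * T_COUNT)
    vowel = V_BASE + (index % (V_COUNT * T_COUNT)) // T_COUNT
    final = T_BASE + index % T_COUNT
    parts = [lead, vowel]
    if final != T_BASE:  # T_BASE itself means "no final consonant"
        parts.append(final)
    return "".join(chr(cp) for cp in parts)


if __name__ == "__main__":
    print(syllable_count())                  # size of the precomposed block
    print(codepoint_space())                 # total code points available
    print(math.log2(codepoint_space()))      # the "20.1 bit" figure
    print(1_500_000 > codepoint_space())     # quoted estimate vs. code space
    han = "\ud55c"                           # U+D55C
    print(decompose(han) == unicodedata.normalize("NFD", han))
```

The last line compares the hand-rolled decomposition with what unicodedata.normalize("NFD", ...) returns, which is the Form D representation under discussion; going the other way with "NFC" recombines the Jamo sequence into the precomposed syllable.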
Wow, OK, I guess that idea won't work for Korean. Also, since glyph swapping has to be done for merely adjacent characters, doing it for combining ones must be a relatively minor concern.

Out of curiosity, how many of those Korean letters are actually made use of by the language? 1.5 million sounds higher than the number of phonemes that a human can produce... (What if the cluster Jamos were dropped?)

Are we heading for a long-run scenario where Form D becomes canonical and all the old precomposed codepoints are deprecated? NFC seems to be getting more and more entrenched from what I can tell...

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
