...Agreed.
It's just a shame that what was considered equivalent in the Korean
standards is considered canonically distinct (and even compatibility
distinct) in Unicode. This means that the same abstract Korean text
can have two distinct representations in Unicode, and there's no way to match
these Unicode representations together. It also means that when mapping Korean
charsets to Unicode, care must be taken, before making the mapping, that
compound jamos are used wherever possible.
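
To make the point above concrete, here is a minimal Python sketch using the standard `unicodedata` module. The example syllable and the names `compound`, `simple` and `forms` are my own choices for illustration, not anything taken from the Korean standards or from the Unicode text. It compares a compound leading consonant with the same consonant spelled as two simple jamos, under all four normalization forms.

```python
import unicodedata

# Same abstract syllable, two spellings (illustrative choice of syllable):
compound = "\u1101\u1161"      # CHOSEONG SSANGKIYEOK + JUNGSEONG A
simple = "\u1100\u1100\u1161"  # CHOSEONG KIYEOK twice + JUNGSEONG A


def forms(s):
    """Return the four Unicode normalization forms of s, keyed by name."""
    return {f: unicodedata.normalize(f, s) for f in ("NFC", "NFD", "NFKC", "NFKD")}


compound_forms = forms(compound)
simple_forms = forms(simple)

# True if no normalization form brings the two spellings together.
never_equal = all(compound_forms[f] != simple_forms[f] for f in compound_forms)

for name in compound_forms:
    print(name,
          [hex(ord(c)) for c in compound_forms[name]],
          [hex(ord(c)) for c in simple_forms[name]])
print("distinct under every form:", never_equal)
```

The compound spelling composes to a single precomposed syllable under NFC, while the two-jamo spelling keeps a stray leading KIYEOK, so a plain comparison after normalization still treats them as different text.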
If the text is now stored and handled entirely in Unicode without returning to the KSC standard, you won't have any tool other than UCA to collate strings (but collation does not produce strings, just collation weights, and there's currently no tool to reverse a list of weights back to a Unicode string...
...
I note the following, which is part of the text explaining C10:
All processes and higher-level protocols are required to abide by C10 as a minimum.
However, higher-level protocols may define additional equivalences that do not
constitute modifications under that protocol. For example, a higher-level protocol
may allow a sequence of spaces to be replaced by a single space.
Presumably a higher-level protocol could transform Korean text into a standardised form, doing what (in your opinion and mine at least) Unicode normalisation ought to have done.
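
As a rough sketch of what such a protocol-level transformation might look like, here is some Python. Everything in it is an assumption for illustration: the table `COMPOUND_JAMO` is hypothetical and deliberately incomplete (only the five doubled leading consonants), and `fold_jamo` is a name I made up. Unicode itself defines no such equivalence, so this is a private convention layered on top of normalization, not a replacement for it.

```python
import unicodedata

# Hypothetical and incomplete: doubled simple leading consonants mapped to
# their compound jamo. A real protocol would need a full, agreed table.
COMPOUND_JAMO = {
    "\u1100\u1100": "\u1101",  # KIYEOK KIYEOK -> SSANGKIYEOK
    "\u1103\u1103": "\u1104",  # TIKEUT TIKEUT -> SSANGTIKEUT
    "\u1107\u1107": "\u1108",  # PIEUP PIEUP   -> SSANGPIEUP
    "\u1109\u1109": "\u110A",  # SIOS SIOS     -> SSANGSIOS
    "\u110C\u110C": "\u110D",  # CIEUC CIEUC   -> SSANGCIEUC
}


def fold_jamo(text):
    """Fold simple-jamo pairs into compound jamos, then apply NFC."""
    # Decompose first so precomposed syllables expose their jamos.
    decomposed = unicodedata.normalize("NFD", text)
    for pair, compound in COMPOUND_JAMO.items():
        decomposed = decomposed.replace(pair, compound)
    # Recompose so the result is in a standard normalization form.
    return unicodedata.normalize("NFC", decomposed)


simple = "\u1100\u1100\u1161"  # two KIYEOK + A
precomposed = "\uae4c"         # the precomposed syllable for SSANGKIYEOK + A

print(fold_jamo(simple) == fold_jamo(precomposed))
```

With this folding applied on both sides, the two spellings compare equal, which is the kind of additional equivalence the C10 text seems to leave room for.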
-- Peter Kirk [EMAIL PROTECTED] (personal) [EMAIL PROTECTED] (work) http://www.qaya.org/

