[trying to catch up on *some* of the e-mails here...]

François Yergeau wrote:

> This little-known fact (along with the better-known fact that not all
> non-zero-ccc-characters do take part in existing precomposed characters)
> has prompted the W3C's Character Model spec to define "composing
> characters", a concept somewhat distinct from Unicode's combining
> characters.  Appendix C at
....
> contains the definition as well as a list of the characters with
> ccc=0 that do take part in existing compositions; U+102E is there, of
> course, as well as the above-mentioned Hangul plus some others.

Hmm, Hangul. Now, the composition rules for Hangul ARE special.
That's why it's not simply the case that the V and T Jamos are combining
while all the other Hangul characters are just regular non-combining ones.
ALL of the L, V, T, LV, and LVT Hangul characters are CONJOINING.
E.g. an L followed by an LVT is a SINGLE Hangul syllable. The notion
of "composing characters" in that appendix C misses that point,
and goes back to an old proposed (but never in Unicode) model
where there were just the Ls, Vs, and Ts, with the latter two
combining, and    L T V V T   would be a single Hangul syllable.
Unfortunately, that is plain wrong in the adopted model for Hangul.
However,  L L V V T  *is* a single Hangul syllable, so is L LV T T, and
LVT T, and ...  Indeed, an L LVT (e.g.) may normalise to (another) LVT.
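
The sequences listed above can be checked mechanically. What follows is only
a minimal sketch, not a definitive implementation: it assumes the "standard
Korean syllable block" shape from the Unicode Standard (one or more L, then
one or more V, then zero or more T, with a precomposed LV counting as L V and
a precomposed LVT as L V T). The names hangul_type and is_single_syllable are
made up for this illustration, and only the main Hangul Jamo block
(U+1100..U+11FF) is covered; the extended Jamo blocks are left out.

```python
import re

S_BASE, S_COUNT, T_COUNT = 0xAC00, 11172, 28  # precomposed syllable constants

def hangul_type(ch):
    """Return a one-letter type: L, V, T, A (= LV), B (= LVT) or X (other)."""
    cp = ord(ch)
    if 0x1100 <= cp <= 0x115F:
        return "L"
    if 0x1160 <= cp <= 0x11A7:
        return "V"
    if 0x11A8 <= cp <= 0x11FF:
        return "T"
    if S_BASE <= cp < S_BASE + S_COUNT:
        # Every 28th precomposed syllable has no trailing consonant (LV).
        return "A" if (cp - S_BASE) % T_COUNT == 0 else "B"
    return "X"

# L* then a core (L V+ / LV V* / LVT), then T*; A and B stand for LV and LVT.
SYLLABLE = re.compile(r"L*(?:(?:LV|A)V*|B)T*")

def is_single_syllable(text):
    """True if the whole string forms exactly one syllable block."""
    types = "".join(hangul_type(c) for c in text)
    return SYLLABLE.fullmatch(types) is not None

L, V, T = "\u1100", "\u1161", "\u11A8"   # conjoining Jamos
LV, LVT = "\uAC00", "\uAC01"             # precomposed syllables

print(is_single_syllable(L + LVT))            # True
print(is_single_syllable(L + L + V + V + T))  # True
print(is_single_syllable(L + LV + T + T))     # True
print(is_single_syllable(LVT + T))            # True
print(is_single_syllable(L + T + V + V + T))  # False
```

The regular expression works on the type string rather than on the characters
themselves, which keeps the conjoining rule readable; it says nothing about
what any of these sequences normalise to.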


                /kent k

