On Wednesday, August 06, 2003 12:38 PM, Kent Karlsson <[EMAIL PROTECTED]> wrote:

> Since I think <a, ring above, cgj, dot below> should be canonically
> equivalent to <a, dot below, cgj, ring above>, but cannot be made
> so (now), the only ways out seem to be to either formally deprecate
> CGJ, or at least confine it to very specific uses. Other occurrences
> would not be ill-formed or illegal, but would then be non-conforming.

There's a way to specify that <A, RingAbove, CGJ, DotBelow> is well-formed
but <A, DotBelow, CGJ, RingAbove> is not: a CGJ is authorized in a combining
sequence only if it precedes a base character, or if it precedes a combining
character whose combining class is strictly lower than the combining class of
the character that comes just before the CGJ.

So, with this definition, and with the combining classes indicated:

- <A=0, RingAbove=230, CGJ=0, DotBelow=220> is well-formed because 220 < 230.
  It is distinct from <A=0, RingAbove=230, DotBelow=220>, whose canonical
  ordering is <A=0, DotBelow=220, RingAbove=230>.

- <A=0, DotBelow=220, CGJ=0, RingAbove=230> is ill-formed because 230 > 220.
  The CGJ is superfluous and should be removed to create
  <A=0, DotBelow=220, RingAbove=230>.

- <A=0, DotBelow=220, CGJ=0, Cedilla=220> is ill-formed because 220 = 220.
  The CGJ is superfluous and should be removed to create
  <A=0, DotBelow=220, Cedilla=220>, which is well-formed and in canonical
  order.

- <A=0, Cedilla=220, CGJ=0, DotBelow=220> is ill-formed because 220 = 220.
  The CGJ is superfluous and should be removed to create
  <A=0, Cedilla=220, DotBelow=220>, which is well-formed and in canonical
  order.

This well-formedness rule would give CGJ exact semantics: used in the middle
of a combining sequence, it would be the only way to bypass the canonical
reordering of combining characters.
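The rule above can be sketched in a few lines of Python, purely as an
illustration of the proposal and not as anything the Unicode Standard
defines. The function names are hypothetical. Combining classes come from
the standard `unicodedata` module, so the equal-class case uses COMBINING
DIAERESIS BELOW (U+0324, class 220) as a stand-in second mark. Treating a
CGJ at the very end of the text as allowed is an assumption, since the rule
does not cover that case.

```python
import unicodedata

CGJ = "\u034f"  # COMBINING GRAPHEME JOINER, combining class 0


def superfluous_cgj_positions(text):
    """Return the indexes of CGJs that the proposed rule would reject."""
    bad = []
    for i, ch in enumerate(text):
        if ch != CGJ:
            continue
        # Class of the character just before the CGJ (0 at start of text).
        prev_ccc = unicodedata.combining(text[i - 1]) if i > 0 else 0
        # Class of the character just after the CGJ.
        # Assumption: a CGJ at end of text is treated like "before a base".
        next_ccc = unicodedata.combining(text[i + 1]) if i + 1 < len(text) else 0
        # Allowed: before a base character (class 0), or before a mark whose
        # class is strictly lower than the class of the preceding character.
        if next_ccc != 0 and not next_ccc < prev_ccc:
            bad.append(i)
    return bad


def strip_superfluous_cgj(text):
    """Remove every CGJ that the proposed rule considers superfluous."""
    bad = set(superfluous_cgj_positions(text))
    return "".join(ch for i, ch in enumerate(text) if i not in bad)


RING_ABOVE = "\u030a"       # class 230
DOT_BELOW = "\u0323"        # class 220
DIAERESIS_BELOW = "\u0324"  # class 220

# 220 < 230: the CGJ really blocks a reordering, so it is kept.
print(superfluous_cgj_positions("A" + RING_ABOVE + CGJ + DOT_BELOW))        # []
# 230 > 220: already in canonical order, so the CGJ is superfluous.
print(superfluous_cgj_positions("A" + DOT_BELOW + CGJ + RING_ABOVE))        # [2]
# 220 = 220: equal classes are never reordered, so the CGJ is superfluous.
print(superfluous_cgj_positions("A" + DOT_BELOW + CGJ + DIAERESIS_BELOW))   # [2]
```

Under these assumptions, `strip_superfluous_cgj` returns a sequence in which
every remaining CGJ blocks a reordering that normalization would otherwise
perform, which is the only role the rule leaves to a CGJ placed between two
combining marks.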

