RE: CGJ - Combining Class Override

Jony Rosenne Sat, 25 Oct 2003 20:43:40 -0700

Sorry, Philippe, I had meant a separate character for a "right Meteg", not a
separate control character. Does this mean we agree?


Jony

> -----Original Message-----
> From: Philippe Verdy [mailto:[EMAIL PROTECTED] 
> Sent: Saturday, October 25, 2003 5:58 PM
> To: Jony Rosenne
> Cc: [EMAIL PROTECTED]
> Subject: Re: CGJ - Combining Class Override
> 
> 
> From: "Jony Rosenne" <[EMAIL PROTECTED]>
> 
> > For the record, I repeat that I am not convinced that the CGJ is an
> > appropriate solution for the problems associated with the right Meteg.
> > I tend to think we need a separate character.
> 
> Yes, it's possible to devise another character explicitly to 
> override very precisely the ordering of combining classes. 
> But this still does not change the problem, as all the 
> existing NF* forms in existing documents using any past or 
> present version of Unicode MUST remain in NF* form after 
> further additions to the standard.
> 
> If one votes for a separate control character, it should come 
> with precise rules describing how such an override can/must be 
> used, so that we won't break existing implementations. This 
> character will necessarily have combining class 0, but will 
> still have a preceding context. Strict conformance for the 
> new NF* forms must still obey the precise ordering rules, 
> and this character, whatever its form, shall not be used 
> whenever it is not needed, i.e. when the existing
> NF* forms still produce the correct logical order (that's why 
> its use should then be restricted to a list of known 
> combining characters that may need this override).
> 
> Call it <CCO> "Combining Class Override"? This does not 
> change the problem: this character should be used only 
> between pairs of combining characters, such that the encoded sequence:
>     {c1, CCO, c2}
> shall conform to the rules:
>     (1) CC(c1) > CC(c2) > 0,
>     (2) c1 is known (listed by Unicode?) to require this override
>     to keep the logical ordering needed for correct text semantics.
> 
> The second requirement should be made to avoid abuses of this 
> character. But it is not enforceable if CGJ is kept for this function.
> 
> The CCO character should then be made "ignorable" for
> collation or text breaks, so that collation keys will become:
>     [ CK(c1), CK(c2) ]  for {c1, CCO, c2}
>     [ CK(c2), CK(c1) ]  for {c2, c1} and {c1, c2} if normalized
> 
> Legacy applications will detect a separate combining sequence 
> starting at CCO, but newer applications will still know that 
> both sequences are describing a single grapheme cluster.
> 
> This knowledge should not be necessary except in grapheme 
> renderers, or in some input methods that will allow users to
> enter:
>     (1) keys <c2><c1> producing the normalized text {c2, c1}
>          as before;
>     (2) keys <c1><c2> producing the normalized text {c1, CCO, c2}
>          instead of {c2, c1} as before;
>     (3) optionally support a keystroke or selection system to swap
>          combining characters.
> 
> If this is too complex, the only way to manage the situation 
> is to duplicate the existing combining characters that cause this 
> problem, and I think this may be even worse, as this 
> duplication may need to be combinatorial and require a lot of 
> new codepoint assignments.
> 
> 
>
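
The reordering discussed in the quoted message can be tried out with Python's standard unicodedata module. This is only a rough sketch under stated assumptions: CCO is not an encoded character, so CGJ (U+034F, combining class 0) stands in for it, and the pair Meteg (class 22) / Patah (class 17) on a Lamed base is just an illustrative choice of c1 and c2. The helper names needs_override and mark_order are made up for this example, and mark_order is a toy stand-in for collation keys, not a real collation algorithm.

```python
import unicodedata

CGJ = "\u034F"    # COMBINING GRAPHEME JOINER, combining class 0 (stand-in for CCO)
METEG = "\u05BD"  # HEBREW POINT METEG, combining class 22 (plays the role of c1)
PATAH = "\u05B7"  # HEBREW POINT PATAH, combining class 17 (plays the role of c2)
LAMED = "\u05DC"  # HEBREW LETTER LAMED, used here as the base letter


def needs_override(c1, c2):
    """Rule (1) from the message: CC(c1) > CC(c2) > 0."""
    return unicodedata.combining(c1) > unicodedata.combining(c2) > 0


def mark_order(text):
    """Toy collation view: normalize, then keep only the marks with a
    non-zero combining class, so the class-0 joiner is ignored."""
    nfd = unicodedata.normalize("NFD", text)
    return [ch for ch in nfd if unicodedata.combining(ch) > 0]


plain = LAMED + METEG + PATAH         # {c1, c2} without any override
joined = LAMED + METEG + CGJ + PATAH  # {c1, CCO, c2}

# Canonical ordering swaps the two marks in the plain sequence,
# while the class-0 character in between blocks the swap.
nfc_plain = unicodedata.normalize("NFC", plain)
nfc_joined = unicodedata.normalize("NFC", joined)

print(nfc_plain == LAMED + PATAH + METEG)
print(nfc_joined == joined)
print(mark_order(plain) == [PATAH, METEG])
print(mark_order(joined) == [METEG, PATAH])
```

With the joiner in place the logical order c1, c2 survives normalization, and ignoring the joiner yields the mark order [c1, c2], whereas the plain sequence normalizes to [c2, c1]. This matches the two collation key lines in the quoted message.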
