For the record, I repeat that I am not convinced that the CGJ is an appropriate solution for the problems associated with the right Meteg. I tend to think we need a separate character.
Jony

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Philippe Verdy
> Sent: Saturday, October 25, 2003 1:12 PM
> To: Peter Kirk
> Cc: [EMAIL PROTECTED]
> Subject: Re: New contribution N2676
>
>
> From: "Peter Kirk" <[EMAIL PROTECTED]>
> > Have combining classes actually been defined for these characters?
> >
> > This is of course exactly the same problem as with Hebrew vowel points
> > and accents, except that this time it applies to real living
> > languages. Perhaps it is time to do something about these combining
> > classes which conflict with the standard.
>
> Do you mean officially documenting the correct (and strict)
> use of CGJ as the only way to bypass the default order
> required by the combining classes in normalized forms? It
> would be a good idea to document officially which use of CGJ
> is superfluous and should be avoided in NF forms, and which
> use is required.
>
> 1) This will affect only the input methods for those
> languages that need to "swap" the standard order of combining
> characters to keep their logical order (all this will require
> is an additional input control that will try swapping
> ambiguous orders).
>
> 2) A complete documentation may need to specify which pairs
> of combining characters are affected (this should list the
> pairs of combining characters <c1, c2> where CC(c1) > CC(c2)
> and that need to be encoded as <c1, CGJ, c2> to be kept in
> logical order, as the sequence <c1, c2> will be reordered
> into <c2, c1> in normalized forms).
>
> 3) The other issue would be that there may exist other
> combining characters than those in this pair. Suppose I want
> to represent <base, c1, c2, c3>, where CC(c1) > CC(c2), but
> c3 does not have a conflicting pair in the previous list.
> Should it be encoded as <base, c1, CGJ, c2, c3> or as <base,
> c1, c3, CGJ, c2>? As the standard normalization algorithm
> cannot be changed, both sequences will be possible with the
> NF forms, even though they represent the same character.
>
> One could design an extra normalization step to force one
> interpretation (so that only combining characters with
> conflicting combining classes that have been forced "swapped"
> will appear after CGJ, all other diacritics being encoded
> preferably in the first sequence before the CGJ).
>
> This extra step should not be part of the NF forms (because
> Unicode states that normalized forms will be kept normalized
> in all further versions of Unicode), but this could be named
> differently, by describing a system in which extra
> normalization steps may be applied that may change NF forms
> into other "equivalent" sequences also in normalized form.
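
The reordering behavior described in points 2 and 3 of the quoted message can be checked with a minimal sketch using Python's standard `unicodedata` module. The helper names `needs_cgj` and `keep_logical_order` are hypothetical, and the choice of Hebrew characters (alef as base, meteg as c1, patah as c2, etnahta as c3) is an assumption made only for illustration. NFD is used so that canonical composition plays no role.

```python
import unicodedata

CGJ = "\u034f"      # COMBINING GRAPHEME JOINER, combining class 0
ALEF = "\u05d0"     # base letter (illustrative choice)
METEG = "\u05bd"    # c1: HEBREW POINT METEG
PATAH = "\u05b7"    # c2: HEBREW POINT PATAH, lower combining class than meteg
ETNAHTA = "\u0591"  # c3: HEBREW ACCENT ETNAHTA, higher class than both


def needs_cgj(c1: str, c2: str) -> bool:
    """True if canonical reordering would swap <c1, c2>: CC(c1) > CC(c2) > 0."""
    return unicodedata.combining(c1) > unicodedata.combining(c2) > 0


def keep_logical_order(marks: str) -> str:
    """Insert CGJ between adjacent marks that would otherwise be swapped.

    Hypothetical helper: a sketch of what an input method might do,
    not a documented Unicode algorithm.
    """
    out = []
    for mark in marks:
        if out and needs_cgj(out[-1], mark):
            out.append(CGJ)
        out.append(mark)
    return "".join(out)


def nfd(text: str) -> str:
    return unicodedata.normalize("NFD", text)


# Point 2: without CGJ the pair is reordered; with CGJ it is preserved.
plain = ALEF + METEG + PATAH
protected = ALEF + keep_logical_order(METEG + PATAH)
plain_reordered = nfd(plain) == ALEF + PATAH + METEG
protected_stable = nfd(protected) == protected

# Point 3: both placements of c3 survive normalization, yet they differ.
seq_a = ALEF + METEG + CGJ + PATAH + ETNAHTA
seq_b = ALEF + METEG + ETNAHTA + CGJ + PATAH
both_stable = nfd(seq_a) == seq_a and nfd(seq_b) == seq_b
distinct = seq_a != seq_b
```

Under these assumptions, `plain_reordered`, `protected_stable`, `both_stable` and `distinct` should all be true, which is the ambiguity point 3 raises: two distinct normalized sequences for the same intended text.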

