On 25/10/2003 04:11, Philippe Verdy wrote:

From: "Peter Kirk" <[EMAIL PROTECTED]>


Have combining classes actually been defined for these characters?

This is of course exactly the same problem as with Hebrew vowel points
and accents, except that this time it applies to real living languages.
Perhaps it is time to do something about these combining classes which
conflict with the standard.



Do you mean officially documenting the correct (and strict) use of CGJ as
the only way to bypass the default order required by the combining classes
in normalized forms? It would be a good idea to document officially which
use of CGJ is superfluous and should be avoided in NF forms, and which use
is required.


This isn't what I meant, but I agree that some such definition would be a good idea.
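
For illustration, the effect of CGJ on canonical reordering can be seen with Python's standard unicodedata module. This is only a sketch: the sequence of bet with patah and hiriq is an arbitrary example, and the variable names are made up for the purpose.

```python
import unicodedata

BET, PATAH, HIRIQ, CGJ = "\u05D1", "\u05B7", "\u05B4", "\u034F"

# Canonical combining classes of the two vowel points and of CGJ.
classes = [unicodedata.combining(c) for c in (PATAH, HIRIQ, CGJ)]

# Without CGJ, normalisation reorders the two marks by combining class.
plain = BET + PATAH + HIRIQ
plain_nfc = unicodedata.normalize("NFC", plain)

# CGJ has class 0, so it splits the marks into two separate runs
# and the order as typed survives normalisation.
guarded = BET + PATAH + CGJ + HIRIQ
guarded_nfc = unicodedata.normalize("NFC", guarded)

print(classes)
print([hex(ord(c)) for c in plain_nfc])
print([hex(ord(c)) for c in guarded_nfc])
```

Any official guidance would presumably have to say when such a CGJ is required (as here, if patah before hiriq is the intended order) and when it changes nothing and should be left out.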

What I had in mind was a probably hopeless plea for the wrongly assigned combining classes to be corrected. After all, the current assignments manifestly breach the standard, because marks with different classes interact typographically.

I wonder if it would in fact be possible to merge certain adjacent combining classes, as of a future numbered version N of the standard. That would not affect the normalisation of existing text: text normalised before version N would remain normalised in version N and later, although not vice versa. I know that this would break the letter of the current stability policy, but is this kind of backward compatibility actually necessary? The change could be sold to others as required for the internal consistency of Unicode.
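
The one-way compatibility can be sketched in Python. Everything here is hypothetical: canonical_reorder is a simplified stand-in for the reordering step of normalisation (it ignores decomposition and composition), and merged_ccc is an imagined class assignment in which the Hebrew marks share a single class, with the value 10 picked arbitrarily.

```python
import unicodedata

def canonical_reorder(text, ccc):
    """Stable-sort each run of non-starters by the class that ccc returns."""
    out, run = [], []
    for ch in text:
        if ccc(ch) == 0:
            # A starter ends the current run of combining marks.
            out.extend(sorted(run, key=ccc))
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.extend(sorted(run, key=ccc))
    return "".join(out)

def merged_ccc(ch):
    # Hypothetical assignment: every combining mark in 05B0-05C2 gets class 10.
    if "\u05B0" <= ch <= "\u05C2" and unicodedata.combining(ch) != 0:
        return 10
    return unicodedata.combining(ch)

BET, PATAH, HIRIQ = "\u05D1", "\u05B7", "\u05B4"

# Text normalised under the current classes: hiriq (14) sorts before patah (17).
old_normalised = canonical_reorder(BET + PATAH + HIRIQ, unicodedata.combining)
still_normalised = canonical_reorder(old_normalised, merged_ccc) == old_normalised

# Text written under the merged classes keeps patah before hiriq,
# but a processor using the current classes would reorder it.
new_text = BET + PATAH + HIRIQ
stable_under_merged = canonical_reorder(new_text, merged_ccc) == new_text
stable_under_current = canonical_reorder(new_text, unicodedata.combining) == new_text

print(still_normalised, stable_under_merged, stable_under_current)
```

Because merging adjacent classes keeps the class mapping monotonic, any run already sorted under the current classes stays sorted under the merged ones, which is the sense in which old normalised text would remain normalised.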

If this were possible, the Hebrew and Arabic problem could be partly solved, in a non-optimal way but one which is less messy than the current situation. The idea would be for all Hebrew marks (i.e. all combining marks in 05B0-05C2) to be merged into one combining class, and similarly all Arabic harakat etc., including the new Arabic tone signs. This would make the relative order of multiple vowels (and meteg) significant, and avoid the need for CGJ hacks. It would also allow the logical order of shadda, dagesh and sin and shin dots to be the canonical one, with significant advantages for collation etc. as well as for rendering.
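
As a rough illustration of how the current classes move these marks away from the base letter, the following Python sketch normalises two short sequences. It assumes that the logical order puts shadda, and the shin dot and dagesh, before the vowel; the sample sequences are arbitrary.

```python
import unicodedata

def names(text):
    """Return the Unicode character names of a string, for readable output."""
    return [unicodedata.name(c) for c in text]

# Assumed logical order: beh, shadda (class 33), fatha (class 30).
arabic = "\u0628\u0651\u064E"
# Assumed logical order: shin, shin dot (24), dagesh (21), qamats (18).
hebrew = "\u05E9\u05C1\u05BC\u05B8"

# Under the current classes the vowel is sorted ahead of the other marks.
arabic_nfd = unicodedata.normalize("NFD", arabic)
hebrew_nfd = unicodedata.normalize("NFD", hebrew)

print(names(arabic_nfd))
print(names(hebrew_nfd))
```

With a single merged class per script, both input sequences would already be in canonical order and would pass through normalisation unchanged.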

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/




