Arno Schmitt noted:

> The marks in the Arabic bloc are not well organized;

A well-known fact that has resulted from the prior legacy for Arabic
encoding brought into Unicode, followed by twenty years of incremental
encoding of additional marks, as evidence has been brought to bear and
proposals for encoding have been made.

> they belong to eleven mark classes, eight for marks above the
> base character, three for marks below.

> In Unicode logic marks within a class with a lower number should
> be closer to the base character than those within a class with a
> higher number.

This is not true.

The fact that it is not true essentially moots most of the
argumentation that follows.

For Arabic and Hebrew in particular, the history of canonical combining
class assignments for marks has been complicated, but all of the
"fixed position" combining class assignments were originally made in
full knowledge that they would not (and could not) be used to untangle
the mutual stacking and placement rules for harakat or other kinds of
marks. The positioning of vowels and other marks for Arabic and Hebrew
was assumed to be "fixed" by the layout rules of the script, which had
to be implemented by rendering engines. The canonical combining classes
themselves certainly do not force particular placement of marks with
respect to each other.

> Should we try to remedy this?

It *cannot* be remedied.

> Is there any software that uses the mark classes directly?

Yes. Their *only* significant function is in the Canonical Ordering
portion of the Unicode Normalization Algorithm. And because of the
stability guarantees for normalization, no canonical combining class
assignment for any character can be changed, once assigned:

http://www.unicode.org/policies/stability_policy.html#Normalization

> Let's look at the position of all the marks irrespective for
> their current Unicode mark class.

[ Snipping the following interesting discussion about actual placement
of marks in Arabic, which people may wish to comment on separately. ]

--Ken
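
P.S. For anyone who wants to see the Canonical Ordering point in action, here is a minimal Python sketch using only the standard unicodedata module. The helper name canonical_classes and the sample string are made up for illustration; the exact class values reported depend on the Unicode data shipped with your Python build, though the values for these long-encoded harakat should be stable. It shows normalization sorting a shadda (class 33) and a kasra (class 32) purely by class number, which says nothing about where a rendering engine draws them.

```python
import unicodedata


def canonical_classes(text):
    """Hypothetical helper: list the canonical combining class of each character."""
    return [unicodedata.combining(ch) for ch in text]


# BEH, then SHADDA (class 33), then KASRA (class 32), in typed order.
typed = "\u0628\u0651\u0650"

# Canonical Ordering sorts adjacent non-zero-class marks by class number,
# so normalization moves the KASRA in front of the SHADDA.
normalized = unicodedata.normalize("NFC", typed)

for label, s in (("typed", typed), ("normalized", normalized)):
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in s)
    print(label, code_points, canonical_classes(s))
```

The two strings are canonically equivalent, so the reordering is only about giving equivalent sequences one representation. Visual stacking is still up to the font and the rendering engine.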

