From: "Peter Kirk" <[EMAIL PROTECTED]>
> So the logical order is
> <shin, sin/shin dot, dagesh, vowel, meteg>.
> But the canonical order is
> <shin, vowel, dagesh, meteg, sin/shin dot>;
> up to three (and in theory
> more, at least in biblical Hebrew) other characters may appear between
> the base letter and the dot which fundamentally modifies it.
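
The two orders quoted above can be checked against the canonical combining classes. The following is only a minimal sketch using Python's standard unicodedata module; the choice of qamats as the example vowel and the helper name combining_classes are assumptions made for illustration.

```python
import unicodedata

SHIN = "\u05E9"      # HEBREW LETTER SHIN (base letter, combining class 0)
SHIN_DOT = "\u05C1"  # HEBREW POINT SHIN DOT
DAGESH = "\u05BC"    # HEBREW POINT DAGESH OR MAPIQ
QAMATS = "\u05B8"    # HEBREW POINT QAMATS (example vowel, chosen arbitrarily)
METEG = "\u05BD"     # HEBREW POINT METEG

# Logical order from the quoted message:
# <shin, sin/shin dot, dagesh, vowel, meteg>
logical = SHIN + SHIN_DOT + DAGESH + QAMATS + METEG


def combining_classes(text):
    """Return the canonical combining class of each code point in text."""
    return [unicodedata.combining(ch) for ch in text]


# Normalization sorts the marks after the base letter by combining class.
canonical = unicodedata.normalize("NFD", logical)

for label, text in (("logical", logical), ("canonical", canonical)):
    names = [unicodedata.name(ch) for ch in text]
    print(label, names, combining_classes(text))
```

With these five characters, normalization moves the shin dot to the end of the sequence, which is the canonical order given in the quote.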

Oh, I forgot the case of the dagesh consonant modifier. But why would you want to encode the meteg before the vowel that it modifies? Couldn't it be encoded locally as well, after that vowel, like this:

1) The consonant group: <shin or other base consonant>, <sin/shin dot>, <dagesh>
2) The first vowel group: <vowel>, <meteg or other accents>.

From a Hebrew reader's perspective, this logical order makes sense, as it consistently groups the marks in the order in which they effectively modify the letter:
- One first reads the <shin or other base consonant>
- Then alters it into a sin letter with <shin or other base consonant>, <sin/shin dot>
- Then uses the alternate pronunciation by adding the <dagesh>
- Then recognizes the first "base" vowel sign
- Then alters it according to the added accents

You agree with me that the use of "combining order overrides" must be restricted, so that it won't be abused. The idea of using CGJ to encode them may be counterproductive, but one can simply avoid such abuse by creating such a CCO control within each script, here in the Hebrew block, by naming it simply HEBREW VOWEL GROUP HOLDER, which would have properties similar to those of the other Hebrew base consonants. The same thing may be added in one of the Arabic blocks, and possibly in other scripts like Tibetan, where similar issues may appear, or in rare extinct scripts as an "implied" missing base letter that would help fix the combining order.

This principle may help resolve the ambiguities in all the affected scripts. (Maybe there are similar issues in the Latin script for Vietnamese, which would like to better fit the phonetics of words that may be incorrectly rendered by the currently required normalization order of multiple accents.) The same issue also exists when there's a need to change the visual stacking order of accents on Latin letters (for example, if a macron should appear below or above a dieresis).
In this case, the CCO control added to the general (Latin/Greek/Cyrillic) scripts would more likely be named something like ACCENT HOLDER. And why not in Japanese too, if diacritics need to be added on top of Hiragana/Katakana letters with voicing marks?

I see the idea of CCO control characters as addressing a general problem rather than something specific to each language (like Biblical Hebrew), and I see no reason why it could not be admitted and generalized with its own character property category.
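
To make the "order override" idea concrete, here is a rough sketch of what such a control would have to do. No HEBREW VOWEL GROUP HOLDER or ACCENT HOLDER character exists, so the sketch substitutes U+034F COMBINING GRAPHEME JOINER, whose combining class is 0 and which therefore stops canonical reordering across it. The function name hold_order is hypothetical, and any effect of CGJ on rendering or searching is not examined here.

```python
import unicodedata

CGJ = "\u034F"       # COMBINING GRAPHEME JOINER, combining class 0
SHIN = "\u05E9"      # HEBREW LETTER SHIN
SHIN_DOT = "\u05C1"  # HEBREW POINT SHIN DOT
DAGESH = "\u05BC"    # HEBREW POINT DAGESH OR MAPIQ
QAMATS = "\u05B8"    # HEBREW POINT QAMATS (example vowel)
METEG = "\u05BD"     # HEBREW POINT METEG


def hold_order(text):
    """Insert CGJ wherever normalization would otherwise swap two marks.

    Hypothetical helper: a mark is preceded by CGJ when its combining
    class is lower than that of the mark just before it.
    """
    out = []
    previous_class = 0
    for ch in text:
        current_class = unicodedata.combining(ch)
        if current_class != 0 and previous_class > current_class:
            out.append(CGJ)
        out.append(ch)
        previous_class = current_class
    return "".join(out)


logical = SHIN + SHIN_DOT + DAGESH + QAMATS + METEG
held = hold_order(logical)

# The protected string keeps its logical order under normalization.
print(unicodedata.normalize("NFD", held) == held)
# The unprotected string does not.
print(unicodedata.normalize("NFD", logical) == logical)
```

Note that the protected string is no longer canonically equivalent to the unprotected one, which is part of why unrestricted use of such a control could be abused.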

