On 27/10/2003 07:28, Philippe Verdy wrote:

From: "Peter Kirk" <[EMAIL PROTECTED]>



So the logical order is
<shin, sin/shin dot, dagesh, vowel, meteg>.
But the canonical order is
<shin, vowel, dagesh, meteg, sin/shin dot>;
up to three (and in theory
more, at least in biblical Hebrew) other characters may appear between
the base letter and the dot which fundamentally modifies it.
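
The reordering described above can be reproduced with a short Python sketch. It assumes the standard unicodedata module and uses patah as a stand-in for "vowel"; the constant and helper names are illustrative only and do not come from the thread.

```python
import unicodedata

# Hebrew code points used in the example above.
SHIN = "\u05E9"      # HEBREW LETTER SHIN
SHIN_DOT = "\u05C1"  # HEBREW POINT SHIN DOT
DAGESH = "\u05BC"    # HEBREW POINT DAGESH OR MAPIQ
PATAH = "\u05B7"     # HEBREW POINT PATAH (assumed stand-in for "vowel")
METEG = "\u05BD"     # HEBREW POINT METEG


def code_points(text):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def combining_classes(text):
    """Return the canonical combining class of each character."""
    return [unicodedata.combining(ch) for ch in text]


# Logical order: <shin, shin dot, dagesh, vowel, meteg>
logical = SHIN + SHIN_DOT + DAGESH + PATAH + METEG

# Canonical decomposition sorts each run of combining marks by class.
canonical = unicodedata.normalize("NFD", logical)

print("logical:  ", code_points(logical), combining_classes(logical))
print("canonical:", code_points(canonical), combining_classes(canonical))
```

The shin dot has the highest combining class of the four marks, which is why normalization moves it to the end of the sequence.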



Oh, I forgot the case of the dagesh consonant modifier.


But why would you want to encode the meteg before the vowel that it
modifies? Couldn't it be encoded locally as well, after that vowel, like this:
1) The consonant group: <shin or other base consonant>, <sin/shin dot>,
<dagesh>
2) The first vowel group: <vowel>, <meteg or other accents>.


The issue with meteg is that it can occur to the left of the vowel (the commonest case), to the right of it, or in the middle of it. See http://www.qaya.org/academic/hebrew/Issues-Hebrew-Unicode.html sections 3.4 and 3.5. These three positions of meteg are essentially the same character, but they have subtly different meanings. It is effectively an independent issue from vowels and the sin/shin dot.

From a Hebrew reader's perspective, this logical order makes sense, as it
consistently groups the characters in the order in which they are effectively modified:
- One first reads the <shin or other base consonant>
- Then alters it into a sin letter with <shin or other base
consonant>, <sin/shin dot>
- Then applies the alternate pronunciation by adding the <dagesh>
- Then recognizes the first "base" vowel sign
- Then alters it according to the added accents

You agree with me that the use of "combining order overrides" must be restricted,
so that they won't be abused. The idea of using CGJ to encode them may be
counterproductive, but one can simply avoid such abuse by creating such CCO
controls within each script, ...
I see the general idea of CCO control characters as a general problem rather
than something specific to each language (like Biblical Hebrew), and I see no
reason why it could not be admitted and generalized with its own character
property category.




I don't see any difference between your proposed generic CCO and CGJ. As you say, the same function may be needed in several scripts, including perhaps IPA which uses complex diacritic stacking. So why not simply use CGJ?
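
As a rough illustration of why CGJ can serve this purpose: CGJ (U+034F) has canonical combining class 0, so normalization does not reorder marks across it. The Python sketch below assumes the standard unicodedata module, uses patah as an example vowel, and places the meteg before the vowel as one of the positions mentioned above. The variable names are illustrative, and the sequence is a demonstration of the mechanism, not an established encoding convention.

```python
import unicodedata

SHIN = "\u05E9"   # HEBREW LETTER SHIN
PATAH = "\u05B7"  # HEBREW POINT PATAH (assumed example vowel, class 17)
METEG = "\u05BD"  # HEBREW POINT METEG (class 22)
CGJ = "\u034F"    # COMBINING GRAPHEME JOINER (class 0)

# Meteg encoded before the vowel, with nothing to protect the order.
unguarded = SHIN + METEG + PATAH

# Same sequence, with CGJ separating the meteg from the vowel.
guarded = SHIN + METEG + CGJ + PATAH

unguarded_nfd = unicodedata.normalize("NFD", unguarded)
guarded_nfd = unicodedata.normalize("NFD", guarded)

# Without CGJ the two marks are sorted by class, so the vowel comes first.
print("unguarded reordered:", unguarded_nfd != unguarded)

# With CGJ the class-0 character splits the run, so nothing moves.
print("guarded preserved:  ", guarded_nfd == guarded)
```

The cost is an extra invisible character in the text, which searching and collation then have to cope with.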

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/




