On Friday, June 27, 2003 3:23 PM, Karljürgen Feuerherm <[EMAIL PROTECTED]> wrote:
> > At 04:22 -0500 2003-06-27, [EMAIL PROTECTED] wrote:
> Now, Q: I take it the combining classes are linked to the script,
> rather than say to a dialect--e.g. one can't define BH as a separate
> dialect from MH with its own set of rules? (I assume this is the case
> because otherwise someone would have proposed it already.)
> 
> I REALLY think that option 1 should be beaten to death with a stick,
> then beaten to death again, before settling for one of the others.
> 
> Hoping this didn't sound like a pointless diatribe but rather that
> taking a step back from the details might help?

Do you then propose to create a specific character, for use within the Hebrew script
only, as a way to specify an alternate order for Hebrew cantillation? In that case, it
would be more appropriate to define new standard variants of these cantillation marks,
and list them in the supported variants, to be used specifically for Biblical Hebrew.

The rule for their use must, however, be simple: the variant selector must be legal
before any cantillation mark, even where it is not strictly necessary (for example
between a base Hebrew character and a Hebrew point, or between two Hebrew points whose
normalization combining order is not defective).

This would allow writing a simple transcoding algorithm for the existing encoded texts
(using only the ISO 10646 encoding rules), and would allow further optimization of the
transformed text by removing variant selectors where they are not strictly necessary.
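
To make the two steps concrete, here is a rough Python sketch of such a transcoder and
of the later clean-up pass. It rests on several assumptions that are mine, not part of
any standard: U+FE00 (VARIATION SELECTOR-1) stands in as a placeholder selector, the
cantillation marks are taken to be the range U+0591..U+05AF, and the selector is placed
before the mark as described above (Unicode itself defines variation selectors as
following the character they modify, and no such Hebrew sequences are registered). The
function names are hypothetical. The sketch only relies on the selector having
combining class 0, so that canonical reordering cannot move marks across it.

```python
import unicodedata

# Placeholder selector (assumption): VARIATION SELECTOR-1, combining class 0.
VS = "\uFE00"


def is_cantillation(ch):
    # Assumption: the cantillation marks are the accents U+0591..U+05AF.
    return "\u0591" <= ch <= "\u05AF"


def visible(text):
    # The text as a reader sees it, with the placeholder selectors dropped.
    return text.replace(VS, "")


def add_selectors(text):
    # Step 1: put a selector before every cantillation mark, needed or not.
    out = []
    for ch in text:
        if is_cantillation(ch):
            out.append(VS)
        out.append(ch)
    return "".join(out)


def strip_redundant_selectors(text):
    # Step 2: drop each selector whose removal leaves the normalized mark
    # order unchanged. Quadratic, which is fine for a sketch.
    chars = list(text)
    i = 0
    while i < len(chars):
        if chars[i] == VS:
            current = "".join(chars)
            candidate = "".join(chars[:i] + chars[i + 1:])
            same = (visible(unicodedata.normalize("NFD", candidate))
                    == visible(unicodedata.normalize("NFD", current)))
            if same:
                del chars[i]   # selector was not protecting anything
                continue       # re-examine the character now at index i
        i += 1
    return "".join(chars)


# Alef, then PASHTA (class 230) followed by ETNAHTA (class 220): plain
# normalization would swap the two accents.
sample = "\u05D0\u0599\u0591"
marked = add_selectors(sample)
compact = strip_redundant_selectors(marked)
```

With this sample, the selector in front of the first accent is dropped and the one
between the two accents is kept, since only the latter stops normalization from
swapping them.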

This way, we won't override the semantics of the existing ZWJ or CGJ characters, which
were initially created to be used only before a base character, to join combining
sequences in the renderer or to disallow a candidate break. The breaking algorithms
are already complex enough without adding special semantics to these characters.

By contrast, variant selectors are much cleaner, and the extra optimization for
their superfluous use can be added to UAX #15, simply because variant selectors are
only legal (and thus stable) in the predefined sequences.

Variant selectors do not break the stability pact, because under this pact, a <VS, 
character> sequence is considered (for XML and other related standards) as distinct 
from the isolated character without the variant selector, and thus can have distinct 
character properties.

This also has the advantage that there is absolutely no need to recode all the existing
documents written in modern Hebrew, and the problem can be isolated to just the few
already encoded historic documents.

-- Philippe.

