Re: Umlaut and Tréma, was: Variation selectors and vowel marks

Peter Kirk Wed, 14 Jul 2004 17:01:34 -0700

On 14/07/2004 23:10, Kenneth Whistler wrote:

...

Thanks for all the clarification which I have snipped.

One such situation is Holam Male which never takes an additional combining mark*. So why can't we represent it as <VAV, HOLAM, variation selector>?

Because the UTC has ruled out <CM, VAR> as interpretable sequences.

Is there a better reason than "because we say so"? You don't have to answer that one.

After all in practice there is no normalisation problem with this. (By the way, I am proposing as one option <VAV, variation selector, HOLAM>, but that has been opposed on the debatable grounds that what changes is not the VAV but the HOLAM - the best description is that the whole grapheme cluster changes.)

I don't have a quarrel with describing things that way -- but you just can't get from here to there with variation selectors.

I don't quite understand you here. Are you saying that <VAV, variation selector, HOLAM> would be acceptable for representing a variation of the entire grapheme cluster, or that it would not?

The alternatives which we might consider include <VAV, ZW(N)J, HOLAM>. This corresponds closely to Peter Constable's recommendations for Indic languages in http://www.unicode.org/review/pr-37.pdf, which is to use <base, ZWJ, VIRAMA>, and indeed to the existing special-case rule for Bengali RA + ya-phalaa in Figure 12 of that document. Or would we do much better to stick to <ZW(N)J, VAV, HOLAM> or <HOLAM, ZW(N)J, VAV>, keeping ZW(N)J outside the combining sequence?
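To make the normalisation point concrete, here is a rough Python sketch of the candidate sequences discussed above. It checks only what the standard unicodedata module reports about canonical combining classes and NFC/NFD stability; it says nothing about whether the UTC would accept any of these sequences or how a font would render them. The choice of VARIATION SELECTOR-1 (U+FE00) and of ZWJ rather than ZWNJ is my assumption for illustration, and the names candidates, combining_classes and is_stable are hypothetical helpers, not part of any proposal.

```python
import unicodedata

VAV = "\u05D5"    # HEBREW LETTER VAV
HOLAM = "\u05B9"  # HEBREW POINT HOLAM
VS1 = "\uFE00"    # VARIATION SELECTOR-1 (assumed here purely for illustration)
ZWJ = "\u200D"    # ZERO WIDTH JOINER (ZWNJ, U+200C, behaves the same in this check)

# The orderings mentioned in the message, keyed by their notation.
candidates = {
    "<VAV, HOLAM, VS>": VAV + HOLAM + VS1,
    "<VAV, VS, HOLAM>": VAV + VS1 + HOLAM,
    "<VAV, ZWJ, HOLAM>": VAV + ZWJ + HOLAM,
    "<ZWJ, VAV, HOLAM>": ZWJ + VAV + HOLAM,
    "<HOLAM, ZWJ, VAV>": HOLAM + ZWJ + VAV,
}


def combining_classes(text):
    """Return the canonical combining class of each code point."""
    return [unicodedata.combining(ch) for ch in text]


def is_stable(text):
    """True if neither NFC nor NFD alters the sequence."""
    return all(unicodedata.normalize(form, text) == text for form in ("NFC", "NFD"))


for label, seq in candidates.items():
    print(label, combining_classes(seq), "stable:", is_stable(seq))
```

Only HOLAM has a non-zero combining class in any of these sequences, so canonical reordering has nothing to exchange it with. Reordering would come into play only if a second combining mark were attached to the same base, which is the case said above not to arise for Holam Male.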

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/
