From: "John Hudson" <[EMAIL PROTECTED]>
> Philippe Verdy wrote:
>
> >>>The problem with <HOLAM, VAV> is that it may follow (in the encoded
> >>>sequence) some other grapheme cluster terminated by other cantillation
> >>>marks. So for me the best candidate would be: <CGJ, HOLAM, VAV>...
>
> Would ZWNJ perform the same function?
>
> If the intent is that the holam be associated with the vav rather than the preceding
> letter, it seems to me that a control character that does not suggest joining or combining
> with the preceding letter would be tidier. I realise that from a processing perspective it
> might be irrelevant, but it would be nice if the names of these control characters still
> suggested something about their use.

Yes, but the two options need to be considered against the possible caveats with existing implementations. I don't know which is better for collation purposes. I was told that CGJ should never be rendered, but just used to control and avoid canonical reordering, which is why I proposed it: it is not really part of the sequence, but is just inserted to avoid the normalization caveat. So a renderer would just skip over it after normalization, and a renderer that performs normalization first could then process the string assuming a consistent order of sequences, without having to consider the case of CGJ.

Also, ZWNJ suggests a break, which may cause problems: holam male is expected to occur in the middle or at the end of a word, and any attempt to isolate it from the beginning of the word would be disastrous. Does ZWNJ create a break opportunity? I need to recheck its status in the existing Unicode reference.
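
To make the reordering point concrete, here is a minimal sketch in Python using the standard unicodedata module. The preceding letter (LAMED) and the cantillation mark (ETNAHTA) are my own picks for illustration and are not taken from this thread. It only looks at canonical reordering: both CGJ and ZWNJ have combining class 0, so either one keeps the holam from being moved in front of the accent. It says nothing about rendering, collation, or line breaking.

```python
import unicodedata

LAMED = "\u05DC"    # HEBREW LETTER LAMED (illustrative base letter)
ETNAHTA = "\u0591"  # HEBREW ACCENT ETNAHTA (illustrative cantillation mark)
HOLAM = "\u05B9"    # HEBREW POINT HOLAM
VAV = "\u05D5"      # HEBREW LETTER VAV
CGJ = "\u034F"      # COMBINING GRAPHEME JOINER
ZWNJ = "\u200C"     # ZERO WIDTH NON-JOINER

def names(text):
    """Return the character names of a string, for readable output."""
    return [unicodedata.name(ch) for ch in text]

# Canonical combining classes drive the reordering step of normalization.
classes = {unicodedata.name(ch): unicodedata.combining(ch)
           for ch in (ETNAHTA, HOLAM, CGJ, ZWNJ)}

# Holam intended for the vav, but encoded right after an accent
# that belongs to the preceding letter.
plain = LAMED + ETNAHTA + HOLAM + VAV
with_cgj = LAMED + ETNAHTA + CGJ + HOLAM + VAV
with_zwnj = LAMED + ETNAHTA + ZWNJ + HOLAM + VAV

nfd_plain = unicodedata.normalize("NFD", plain)
nfd_cgj = unicodedata.normalize("NFD", with_cgj)
nfd_zwnj = unicodedata.normalize("NFD", with_zwnj)

if __name__ == "__main__":
    print(classes)
    print("plain reordered:", nfd_plain != plain, names(nfd_plain))
    print("CGJ reordered:  ", nfd_cgj != with_cgj)
    print("ZWNJ reordered: ", nfd_zwnj != with_zwnj)
```

Without a separator, normalization moves the holam (class 19) in front of the accent (class 220), so it ends up next to the preceding letter. With either control character in between, the sequence comes back unchanged.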

