Philippe Verdy wrote:
>
> Space is a base character, then it combines with the next diacritic
> with which it creates a "default grapheme cluster" which should be
> interpreted as if it was a single character identity.

Agreed so far for diacritics. Agreed also for non-spacing dependent vowels
like U+0BC0. Agreed for the special exceptions like U+0BBE. I disagree for
U+093F or U+0BBF (Mc, not included in Other_Grapheme_Extend, so there is an
allowed break before it), unless there is something I missed here.
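
To make the distinction concrete, here is a small sketch in Python (my choice
for illustration only) that looks up the General_Category of the vowel signs
just mentioned. It uses the standard unicodedata module, which does not
expose Other_Grapheme_Extend, so that set is written out by hand as an
assumption and holds only U+0BBE, the one exception named above. The
"extends" flag reflects the reading given in this message, not a full
implementation of the segmentation rules.

```python
import unicodedata

# Vowel signs discussed above, as code points.
VOWEL_SIGNS = [0x0BC0, 0x0BBE, 0x093F, 0x0BBF]

# Assumption for illustration: unicodedata has no Other_Grapheme_Extend
# lookup, so only the exception named in this message is listed here.
ASSUMED_OTHER_GRAPHEME_EXTEND = {0x0BBE}

def describe(cp):
    """Return (general category, attaches-to-preceding-base flag)
    under the reading given in this message."""
    gc = unicodedata.category(chr(cp))
    extends = gc in ("Mn", "Me") or cp in ASSUMED_OTHER_GRAPHEME_EXTEND
    return gc, extends

for cp in VOWEL_SIGNS:
    gc, extends = describe(cp)
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}: gc={gc}, extends={extends}")
```

With this reading, U+0BC0 and U+0BBE attach to the preceding SPACE, while
U+093F and U+0BBF do not.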

> It is NOT defective.

I do not understand. I did not say anything implying that, did I? I just
remarked that I could not find in the text of the standard any wording
solid enough to require vendors and implementers (like me) to modify
their engines to provide special exceptions for the combination
U+0020/U+00A0 followed by U+093F.

And no, this is not the same as displaying a diacritic, because the vowel
sign has to be re-ordered, rather than being a "spacing representation of
diacritics".


> Now how would you interpret differently SPACE+diacritic or
> SPACE+vowel sign?

See above.

> If you display a dotted circle there, then you'll
> display two separate glyphs for a single grapheme cluster, and this
> is not intended by the normal Unicode character model.

?

How do you think anybody displays, say, U+0063 U+0300? Which font has this
as a single glyph?
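
As a rough illustration (Python again, my choice), the sketch below
normalizes the sequence to NFC and counts code points. It says nothing about
any particular font; it only shows that Unicode has no precomposed character
for c with grave, unlike e with grave, so a renderer cannot simply map the
cluster to the glyph of one encoded character. The helper name nfc_length is
made up for this example.

```python
import unicodedata

def nfc_length(s):
    """Number of code points left after NFC normalization."""
    return len(unicodedata.normalize("NFC", s))

c_grave = "\u0063\u0300"  # c + COMBINING GRAVE ACCENT
e_grave = "\u0065\u0300"  # e + COMBINING GRAVE ACCENT

print(nfc_length(c_grave))  # stays a two-code-point sequence
print(nfc_length(e_grave))  # composes to U+00E8
```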

Furthermore, a single character like U+0916 (Devanagari KHA) is very often
rendered with two glyphs (namely, Half-Kha then the glyph also used for the
AA-matra, U+093E). Unicode does not concern itself with how this is
handled.


Antoine

