Philippe Verdy wrote:
> Space is a base character, then it combines with the next diacritic
> with which it creates a "default grapheme cluster" which should be
> interpreted as if it was a single character identity.

Agreed so far for diacritics. Agreed also for non-spacing dependent vowels like U+0BC0. Agreed for the special exceptions like U+0BBE. I disagree for U+093F or U+0BBF (Mc, not included in Other_Grapheme_Extend, so there is an allowed break before them), unless there is something I missed here.

> It is NOT defective.

I do not understand. I did not say anything implying that, did I? I just remarked that I could not find in the text of the standard any wording that would give vendors and implementers (like me) a solid basis for modifying their engines to provide special exceptions for the combination U+0020/U+00A0 followed by U+093F. And no, this is not the same as displaying a diacritic, because it should be reordered, rather than being a "spacing representation of diacritics".

> Now how would you interpret differently SPACE+diacritic or
> SPACE+vowel sign?

See above.

> If you display a dotted circle there, then you'll
> display two separate glyphs for a single grapheme cluster, and this
> is not intended by the normal Unicode character model.

? How do you believe anybody will show, say, U+0063 U+0300? Which font has this as a single glyph?

Furthermore, a single character like U+0916 (Devanagari KHA) is very often rendered with two glyphs (namely, half-KHA followed by the glyph also used for the AA-matra, U+093E). Unicode does not concern itself with how this stuff is handled.

Antoine
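
For readers who want to check the general categories mentioned above, here is a minimal sketch using Python's standard `unicodedata` module. The helper names (`CODE_POINTS`, `categories`, `nfc_length`) are made up for this illustration. Note that `unicodedata` does not expose the Other_Grapheme_Extend property, so the sketch only shows the Mn/Mc distinction and the fact that U+0063 U+0300 has no precomposed form; the results also depend on the Unicode version bundled with the interpreter.

```python
import unicodedata

# Code points discussed in the message (illustrative selection).
CODE_POINTS = [0x0300, 0x0BC0, 0x0BBE, 0x093F, 0x0BBF]


def categories():
    """Map each code point to its General_Category (e.g. 'Mn', 'Mc')."""
    return {cp: unicodedata.category(chr(cp)) for cp in CODE_POINTS}


def nfc_length(text):
    """Number of code points left after NFC normalization."""
    return len(unicodedata.normalize("NFC", text))


if __name__ == "__main__":
    for cp, cat in categories().items():
        print(f"U+{cp:04X} {cat} {unicodedata.name(chr(cp))}")
    # c + combining grave accent: stays as two code points under NFC,
    # so a renderer has to position two glyphs (or use a font ligature).
    print(nfc_length("c\u0300"))
```

Whether a break is allowed before a given Mc character still has to be looked up in the grapheme break property data, which this sketch does not read.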

