On Thu, Sep 15 2016 at 21:27 CEST, e...@gnu.org writes: [...]
> Isn't "grapheme cluster" the definition you are looking for?

I don't think so.

On Thu, Sep 15 2016 at 21:27 CEST, leobo...@namakajiri.net writes:
> Isn't the Swift "character" and the "textel" merely the same thing as
> what Unicode already named "grapheme clusters"?  (Well, technically UAX
> #29[1] defines them as "user-perceived characters", but then says
> grapheme clusters approximate user-perceived characters
> algorithmically).
>
> And, indeed, Swift "Characters" are explicitly defined as "extended
> grapheme clusters" (also from UAX #29):
>
> https://developer.apple.com/library/content/documentation/Swift/Conceptual/Swift_Programming_Language/StringsAndCharacters.html
>
> Such a notion is indeed needed, but it has been always there.
>
> [1] http://unicode.org/reports/tr29/

Perhaps I don't properly understand the rather obscure definitions, like

    An extended grapheme cluster is the same as a legacy grapheme
    cluster, with the addition of some other characters.

However:

1. Graphemes, if I understand correctly, are language dependent; textels
   are not.

2. Textel "ń" means both U+0144 and <U+006E,U+0301>, so it is a notion on
   a higher abstraction level than a grapheme cluster.

Moreover, I don't want to call <U+006E,U+0301> (LATIN SMALL LETTER N,
COMBINING ACUTE ACCENT) an extended grapheme cluster for at least 2
reasons:

1. there is nothing extended in it

2. U+0301 is not a grapheme according to Polish linguistics terminology

Regards

Janusz

-- 
                           ,
Prof. dr hab. Janusz S. Bien - Uniwersytet Warszawski (Katedra Lingwistyki Formalnej)
Prof. Janusz S. Bien - University of Warsaw (Formal Linguistics Department)
jsb...@uw.edu.pl, jsb...@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/
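
The remark that textel "ń" covers both U+0144 and <U+006E,U+0301> can be illustrated with a minimal sketch. It assumes that "being the same textel" may be approximated by Unicode canonical equivalence (comparison after NFC normalization); that reading is an assumption made here for illustration, not a definition taken from the message. The helper name `same_textel` is hypothetical.

```python
import unicodedata

# Two encodings of the Polish letter "ń".
precomposed = "\u0144"       # LATIN SMALL LETTER N WITH ACUTE
decomposed = "\u006E\u0301"  # LATIN SMALL LETTER N + COMBINING ACUTE ACCENT


def same_textel(a: str, b: str) -> bool:
    """Hypothetical helper: approximate 'same textel' by canonical
    equivalence, i.e. equality after NFC normalization (an assumption)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# As code point sequences the two strings differ ...
print(precomposed == decomposed)             # False
print(len(precomposed), len(decomposed))     # 1 2

# ... but under the assumed reading they denote one and the same textel.
print(same_textel(precomposed, decomposed))  # True
```

Under this reading the textel is the equivalence class, while each of the two code point sequences is only one of its representations.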