On Sunday, 29 May 2016 at 13:04:18 UTC, Tobias M wrote:
On Sunday, 29 May 2016 at 12:08:52 UTC, default0 wrote:
I am pretty sure that a single grapheme in Unicode does not
correspond to your notion of "character". I am pretty sure
that what you think of as a "character" is officially called
"Grapheme Cluster" not "Grapheme".
Grapheme is a linguistic term. AFAIUI, a grapheme cluster is a
cluster of codepoints representing a grapheme. It's called
"cluster" in the Unicode spec, because there is no
dedicated grapheme unit.
I put "character" into quotes, because the term is not really
well defined. I just used it for a short and succinct answer.
I'm sure there's a better/more correct definition of
grapheme/phoneme, but it's probably also much longer and
more complicated.
Which is why we need to agree on a terminology, i.e. be clear
when we use linguistic terms and when we use Unicode-specific
terminology.
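
To make the distinction concrete, here is a rough sketch in Python (chosen only for illustration; nothing in this thread depends on it). It compares three counts for the same text: UTF-8 code units, code points, and an approximation of grapheme clusters. The helper `approx_cluster_count` is a hypothetical name of mine, and it only skips combining marks. The real segmentation rules in the Unicode spec cover many more cases, so treat this as a sketch, not a conforming implementation.

```python
import unicodedata


def approx_cluster_count(text: str) -> int:
    """Count code points that are not combining marks.

    Rough approximation only: real grapheme cluster segmentation
    also handles Hangul jamo, emoji sequences, and other cases.
    """
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


# 'e' followed by U+0301 COMBINING ACUTE ACCENT: one user-perceived
# "character", but two code points.
s = "e\u0301"

utf8_units = len(s.encode("utf-8"))   # UTF-8 code units (bytes)
code_points = len(s)                  # Python str length counts code points
clusters = approx_cluster_count(s)    # approximate grapheme clusters

print(utf8_units, code_points, clusters)
```

For this input the three counts all differ, which is exactly why it matters whether "character" means a code unit, a code point, or a grapheme cluster.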