Dalley Mark (South West Commissioning Support) <mark.dal...@swcsu.nhs.uk>:
> I think the key phrase is "user-perceived". And you don't need to involve
> complex scripts either.
> For instance as an English-speaking person, I would perceive the "æ" in
> "encyclopædia" as being two characters (albeit shoved together somewhat). The
> argument for this is that the word can equally well be rendered as
are all legal spellings of the same word in a writing system, a useful
linguistic definition of grapheme should ensure that all three variants have
the same number of graphemes.
Although linguists often prefer minimal pair analysis, there are some rules of
thumb for what is a grapheme:
- … whatever goes into a single box in a crossword puzzle.
- … whatever gets transposed if you reverse a word or generate an anagram.
- … whatever gets capitalized together in the beginning of a word.
  (Some argue that capitalization operates on characters, not graphemes.)
- … whatever can never be split up by hyphenation.
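
To make the reversal rule of thumb concrete, here is a rough Python sketch. It is not the Unicode grapheme cluster algorithm (UAX #29), only a simplified approximation that glues combining marks onto the preceding base character. The function names approx_graphemes and reverse_by_grapheme and the test word are my own, chosen purely for illustration.

```python
import unicodedata


def approx_graphemes(text):
    """Split text into units of one base character plus any combining marks
    that follow it. This is a simplification, not full UAX #29 segmentation."""
    clusters = []
    for ch in text:
        # unicodedata.combining() returns a nonzero class for combining marks.
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def reverse_by_grapheme(text):
    """Reverse the order of the approximate units, keeping marks attached."""
    return "".join(reversed(approx_graphemes(text)))


# "café" spelled with a plain "e" followed by U+0301 COMBINING ACUTE ACCENT.
word = "cafe\u0301"

naive = word[::-1]                   # reverses code points; the accent comes loose
careful = reverse_by_grapheme(word)  # the accent stays on its "e"
```

Reversing by code point moves the accent to the front of the string, where it no longer sits on the "e". Reversing by unit keeps them together. Note that this sketch still treats "æ" (U+00E6) as a single unit, so it counts 12 units in "encyclopædia". Whether that matches what a reader perceives is exactly the question raised above, and no code-point-based rule settles it.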