Quote/Cytat - Richard Wordingham <richard.wording...@ntlworld.com> (Sun 12 Mar 2017 09:10:22 PM CET):

> On Sun, 12 Mar 2017 20:02:28 +0100
> "Janusz S. Bien" <jsb...@mimuw.edu.pl> wrote:
>
> > If the basic notion has to be referred to in a cumbersome way as
> > "extended grapheme cluster" then it is easier to talk about "Unicode
> > characters" despite the fact that they have a rather loose relation
> > to real-life/user-perceived characters.
>
> The notion that extended grapheme clusters correspond to
> user-perceived characters is also rather dodgy.

The idea is not mine, but it appears from time to time on the list in a more or less explicit way.

> Whereas it may work
> for French, it is getting very dubious by the time one adds Hebrew
> cantillation marks or Vedic accentuation.  The Thais revolted when
> their preposed vowels were joined with the following consonant in the
> same extended grapheme cluster, and Unicode had to revoke that union.

Just yet another reason for introducing the notion of textel?

Best regards

Janusz


--
Prof. dr hab. Janusz S. Bień - Uniwersytet Warszawski (Katedra Lingwistyki Formalnej)
Prof. Janusz S. Bień - University of Warsaw (Formal Linguistics Department)
jsb...@uw.edu.pl, jsb...@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/
