On 6/20/2012 3:22 PM, Karl Williamson wrote:
> All current named sequences appear to be each a single grapheme. That
> seems like it should always be the case.
Possibly, but keep in mind that neither the Unicode Standard nor UAX #29
in particular defines what a "grapheme" is. UAX #29 specifies an algorithm
for determining boundaries between "grapheme clusters", but it can be
tailored, and as a result what the "thing" is between such boundaries is
a little fuzzy. And even the default for that algorithm can and does change.
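
As a rough illustration of that fuzziness, here is a minimal Python sketch.
It assumes Python 3.3 or later, where unicodedata.lookup() accepts named
sequences, and it uses a deliberately crude stand-in for a cluster test: one
leading character followed only by combining marks. That is not the UAX #29
algorithm, and the helper name looks_like_one_cluster is made up for this
example.

```python
import unicodedata

# General categories for combining marks (nonspacing, spacing, enclosing).
MARK_CATEGORIES = ("Mn", "Mc", "Me")


def looks_like_one_cluster(s):
    """Crude approximation only, NOT the UAX #29 algorithm.

    Returns True when every character after the first is a combining mark.
    Real grapheme cluster rules (Hangul, ZWJ, tailorings) are ignored.
    """
    if not s:
        return False
    return all(unicodedata.category(c) in MARK_CATEGORIES for c in s[1:])


# A named sequence from NamedSequences.txt: U+0072 followed by U+0303.
seq = unicodedata.lookup("LATIN SMALL LETTER R WITH TILDE")

print([hex(ord(c)) for c in seq])
print(looks_like_one_cluster(seq))
print(looks_like_one_cluster("ab"))
```

A real check would need a UAX #29 implementation, and a tailored one could
give a different answer for the same sequence.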
Furthermore, I don't see any necessary correlation between what sequences
people might end up insisting on naming (for whatever reason) and what
people might consider to be "graphemes". There could be a valid reason
somebody might want or need to name some sequence that clearly wouldn't
constitute a grapheme. Who can predict?
> If I'm right, should UAX #34 say this?
That seems like a straitjacket looking for an unwilling wearer. ;-)
--Ken