Kent Karlsson wrote:
> > > Except in the original context it should have meant "infinite", as
> > > there is actually an infinite number of potential default grapheme
> > > clusters.
> >
> > How can that be, if there is a finite number of characters that can be
> > part of a cluster, and a (presumably) finite upper bound on the number
> > of characters in a cluster?
>
> There is no such finite upper bound. Theoretically, that is. In practice
> the size of grapheme clusters will be fairly small. A grapheme cluster
> with 9 characters in it, say made of 3 lead Hangul jamos, 3 vowel jamos,
> and 3 trail jamos will be among the larger ones you will encounter.

Aren't there even longer default grapheme clusters with pointed Hebrew or Arabic? Or with Indic scripts, which attach various combining characters to a starter?

> But there is no theoretical bound.

I did not say there was one, and I explained what I meant by "nearly infinite": the number of default grapheme clusters (DGCs) is of course infinite in theory with Unicode, but it is not when DGCs are what users consider to be "characters" in actual written languages. For those there is not an infinite number of possibilities (at the very least it is limited by the lexical entries of the language itself, which are clearly not infinite).

The only exceptions may be mathematics, which can use arbitrarily complex layouts of diacritics in formulas, and complex Han ideographs composed with IDCs and radicals. Clusters could become even longer if a script is ever created to encode organic chemical formulas, which have a complex layout of combining sequences.
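
To make the "no theoretical bound" point concrete, here is a rough Python sketch. It does not implement the full default grapheme cluster rules: it assumes a simplified rule in which a character joins the previous cluster if it is a nonspacing or enclosing mark (general category Mn or Me), or if it continues a run of conjoining Hangul jamos (lead, then vowel, then trail). The names simple_clusters, is_extender and hangul_class are made up for this illustration.

```python
import unicodedata

def hangul_class(ch):
    """Rough jamo class by code point range (an assumption of this sketch)."""
    cp = ord(ch)
    if 0x1100 <= cp <= 0x115F:
        return "L"  # lead consonant
    if 0x1160 <= cp <= 0x11A7:
        return "V"  # vowel
    if 0x11A8 <= cp <= 0x11FF:
        return "T"  # trail consonant
    return None

def is_extender(ch):
    """Treat nonspacing and enclosing marks as extending a cluster."""
    return unicodedata.category(ch) in ("Mn", "Me")

def no_break(prev, cur):
    """True if cur stays in the same cluster as prev (simplified rule)."""
    if is_extender(cur):
        return True
    p, c = hangul_class(prev), hangul_class(cur)
    if p == "L":
        return c in ("L", "V")
    if p == "V":
        return c in ("V", "T")
    if p == "T":
        return c == "T"
    return False

def simple_clusters(text):
    """Split text into clusters under the simplified rule above."""
    clusters = []
    for ch in text:
        if clusters and no_break(clusters[-1][-1], ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

# Three lead, three vowel and three trail jamos: the 9-character example.
jamo = "\u1100" * 3 + "\u1161" * 3 + "\u11A8" * 3
print(len(simple_clusters(jamo)), len(simple_clusters(jamo)[0]))

# A base letter followed by n combining acute accents, for any n.
n = 50
stacked = "a" + "\u0301" * n
print(len(simple_clusters(stacked)), len(simple_clusters(stacked)[0]))
```

Under this simplified rule both strings come out as a single cluster, and n can be raised as far as you like, which is the sense in which there is no theoretical bound. Whether any writing system actually uses such sequences is a separate question.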

