On 16/12/2003 17:21, Kenneth Whistler wrote:

Correcting myself:



Note that none of the 3 sets of equivalence classes violates
*canonical* equivalence, because none of the 8 sequences involved
is canonically equivalent to any other. In other words, no matter
which of the 3 approaches you take to case folding, in no instance
are you claiming that canonically equivalent sequences are to be
interpreted differently.



Actually, dotted I *is* canonically equivalent to <I, dot above>. (I overlooked that when compiling the summary.)



This implies (since there are no decomposition exclusions) that NFD, applied to Turkic text, violates the very sensible rule DO NOT USE COMBINING DOTS WITH I's. It also leads to all sorts of potential confusion: for example, simple case folding, full case folding and lowercasing applied to NFD Turkic text all generate the nonsensical <i, dot above>. This could be a serious problem, although one that may not be worth fixing.
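To make the concern concrete, here is a minimal sketch in Python using the standard unicodedata module. It assumes Python's built-in str.lower() and str.casefold(), which apply the default (not Turkic-tailored) mappings, so it only illustrates the untailored behaviour and says nothing about what a Turkic-aware implementation would do. The variable names are mine, chosen for illustration.

```python
import unicodedata

# U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTTED_CAPITAL_I = "\u0130"
# U+0307 COMBINING DOT ABOVE
COMBINING_DOT_ABOVE = "\u0307"

# NFD decomposes dotted capital I into <I, dot above>,
# since there is no decomposition exclusion for it.
nfd = unicodedata.normalize("NFD", DOTTED_CAPITAL_I)

# Default (untailored) lowercasing and full case folding of the NFD form.
lowered = nfd.lower()
folded = nfd.casefold()

# NFC restores the precomposed capital, confirming canonical equivalence.
recomposed = unicodedata.normalize("NFC", nfd)

for label, text in [("NFD", nfd), ("lower", lowered), ("casefold", folded)]:
    # Print the code points so the combining dot is visible.
    print(label, " ".join(f"U+{ord(ch):04X}" for ch in text))
```

With these default mappings, both the lowercased and the case-folded results are the two-character sequence <i, dot above>, which is the sequence the rule above is meant to keep out of Turkic text.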

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/




