Re: Case mapping of dotless lowercase letters

Peter Kirk Wed, 17 Dec 2003 04:21:47 -0800

On 16/12/2003 17:21, Kenneth Whistler wrote:

> Correcting myself:
>
> > Note that none of the 3 sets of equivalence classes violates *canonical* equivalence, because none of the 8 sequences involved is canonically equivalent to any other. In other words, no matter which of the 3 approaches you take to case folding, in no instance are you claiming that canonically equivalent sequences are to be interpreted differently.
>
> Actually, dotted I *is* canonically equivalent to <I, dot above>
> (I overlooked that when compiling the summary.)
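
The equivalence is easy to check. A minimal sketch, assuming Python's standard unicodedata module; the constant and helper names are only illustrative:

```python
import unicodedata

DOTTED_CAPITAL_I = "\u0130"  # LATIN CAPITAL LETTER I WITH DOT ABOVE
I_PLUS_DOT = "I\u0307"       # <I, COMBINING DOT ABOVE>
DOTLESS_SMALL_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I

def canonically_equivalent(a: str, b: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

# Dotted capital I decomposes to <I, dot above>, so the two are equivalent.
print(canonically_equivalent(DOTTED_CAPITAL_I, I_PLUS_DOT))

# Dotless small i has no canonical decomposition and stays as it is.
print(unicodedata.normalize("NFD", DOTLESS_SMALL_I) == DOTLESS_SMALL_I)
```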

This implies (since there are no decomposition exclusions) that NFD, applied to Turkic text, violates the very sensible rule DO NOT USE COMBINING DOTS WITH I's, and leads to all sorts of potential confusion. For example, simple case folding, full case folding, and lowercasing, when applied to NFD Turkic text, all generate the nonsensical <i, dot above>. This could be a serious problem, although one that may not be worth fixing.
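
To make the effect concrete, here is a rough sketch, again assuming Python's unicodedata module and Python's built-in, locale-independent str.lower() and str.casefold() (which apply the default mappings, not Turkic tailoring). The helper name is hypothetical:

```python
import unicodedata

def nfd_then_lower_and_fold(text: str) -> tuple[str, str, str]:
    """Return (NFD form, lowercased NFD form, case-folded NFD form)."""
    nfd = unicodedata.normalize("NFD", text)
    return nfd, nfd.lower(), nfd.casefold()

def code_points(s: str) -> list[str]:
    """Show a string as a list of U+XXXX code points."""
    return [f"U+{ord(ch):04X}" for ch in s]

# Start from dotted capital I (U+0130), as found in Turkic text.
nfd, lowered, folded = nfd_then_lower_and_fold("\u0130")

print(code_points(nfd))      # the decomposed <I, dot above>
print(code_points(lowered))  # lowercase of the decomposed form
print(code_points(folded))   # default case folding of the decomposed form
```

With the default mappings, both results are <i, dot above>, and NFC cannot recompose that pair because no precomposed character exists for it.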

--
Peter Kirk
[EMAIL PROTECTED] (personal)
[EMAIL PROTECTED] (work)
http://www.qaya.org/
