Stewart Gordon Wrote:

> I is the uppercase form of ı.
> İ is the uppercase form of i.
> 
> http://www.unicode.org/Public/UNIDATA/UnicodeData.txt
> lists them as
> 0049;LATIN CAPITAL LETTER I;Lu;0;L;;;;;N;;;;0069;
> 0069;LATIN SMALL LETTER I;Ll;0;L;;;;;N;;;0049;;0049
> 0130;LATIN CAPITAL LETTER I WITH DOT ABOVE;Lu;0;L;0049 0307;;;;N;LATIN CAPITAL LETTER I DOT;;;0069;
> 0131;LATIN SMALL LETTER DOTLESS I;Ll;0;L;;;;;N;;;0049;;0049
> 
> but this is inadequate: while it tells you how to case-convert ı and İ 
> (that's what the 0049 and 0069 at the end are), you need to add a 
> locale-specific rule to all this to convert I and i in Turkish.

I think there should be three i's, so that strings containing words from 
two languages can be capitalized correctly, e.g. an imaginary company name 
"Ali & Jim". The Turkish lowercase i and the English lowercase i should have 
been separate characters to be able to work with them correctly. The problem 
stems from Unicode, which gives both of them the same code point (U+0069).
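
To make the problem concrete, here is a small sketch. It is in Python only 
because its str.upper() and str.lower() are locale-independent, so they can 
stand in for the default Unicode mappings; nothing here is Phobos code.

```python
# str.upper()/str.lower() apply the default (non-Turkish) Unicode mappings.
name = "Ali & Jim"
default_upper = name.upper()   # both i's become U+0049, dotless capital I

# A Turkish word does not survive an upper/lower round trip:
# the dotless ı (U+0131) comes back as a dotted i (U+0069).
round_trip = "ılık".upper().lower()

print(default_upper)  # ALI & JIM  (right for "Jim", wrong for "Ali")
print(round_trip)     # ilik       (the original was "ılık")
```

With a Turkish-only rule the situation is reversed: "Ali" would come out 
right as "ALİ", but "Jim" would become "JİM".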

A group of us are about to start a small project that involves thin wrappers 
around Phobos to favor the Turkish behavior for character and string 
processing. That should help with applications that are happy to use Turkish 
only. More complex applications could use libraries like IBM's ICU.
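
Roughly, the wrappers would look like the sketch below. It is written in 
Python just to keep it short; the real ones would be D functions around 
Phobos. The names upper_tr and lower_tr are made up for this example, and 
the approach (remap the four i's, then fall back to the default mapping) is 
an assumption about how such a wrapper could work, not a finished design.

```python
# Hypothetical helpers, not an existing library API.
# Only the four i's differ from the default mappings in Turkish.
_TR_UPPER = str.maketrans({"i": "\u0130", "\u0131": "I"})  # i -> İ, ı -> I
_TR_LOWER = str.maketrans({"I": "\u0131", "\u0130": "i"})  # I -> ı, İ -> i

def upper_tr(text: str) -> str:
    # Remap the Turkish-specific letters first, then let the default
    # mapping handle every other character.
    return text.translate(_TR_UPPER).upper()

def lower_tr(text: str) -> str:
    # Same idea in the other direction.
    return text.translate(_TR_LOWER).lower()

print(upper_tr("ılık istanbul"))   # ILIK İSTANBUL
print(lower_tr("ILIK İSTANBUL"))   # ılık istanbul
```

Such a wrapper treats every i as Turkish, which is exactly why it only suits 
applications that work with Turkish text alone.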

Ali
