2013/1/24 Richard Wordingham <[email protected]>:
> On Wed, 23 Jan 2013 23:46:33 +0100
> Philippe Verdy <[email protected]> wrote:
>
>> For this reason Turkic
>> texts *should* encode the hard-dotted lower case i as i+dot above, and
>> not just as i alone. But when the language used in the text is clear,
>> the extra encoding of the explicit "hard" dot above is almost always
>> forgotten and for legacy reasons, most Turkic texts do not use this
>> extra dot above, but it does not mean that its presence is incorrect
>> (it will be needed in multilingual documents, or when using some
>> Medieval-style fonts that do NOT display any dot above U+0069 and
>> U+006A and that require the explicit U+0307 to render the hard dot
>> needed for Turkish).
>
> If text is going to be processed, i+dot is wrong for Turkish, as the
> Unicode casing rules for Turkish will capitalise it to I+dot+dot, which
> should display with two dots. If you're going to use an explicit dot,
> I'd have said <U+0131, U+0307> would be better, though I still think
> using an explicit dot is wrong in general.

Probably yes, the ASCII i/I should be avoided in all cases in Turkish, preferring the dotless i/I every time, with or without the extra dot above. But the legacy use of the ASCII i/I is still prevalent everywhere (notably among those who used the legacy 8-bit encodings, which did NOT have the combining dot above).

My opinion is that capitalizing the ASCII i followed by a combining dot above should NEVER produce two dots (that is a limitation of the current simple case mappings, even when the Turkish rules are used). A correct capitalization for Turkish should produce just a single dot, by mapping not character by character but at the grapheme level.
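
To make the difference concrete, here is a minimal Python sketch, not a reference implementation. Both function names are hypothetical. The first applies only the two Turkish-specific mappings character by character (i to U+0130, U+0131 to I) and falls back to Python's language-independent str.upper() for everything else. The second is one possible reading of "working at the grapheme level": it assumes an explicit U+0307 directly after an ASCII i belongs to that letter and absorbs it into U+0130.

```python
import unicodedata

DOT_ABOVE = "\u0307"   # COMBINING DOT ABOVE
CAP_I_DOT = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE
DOTLESS_I = "\u0131"   # LATIN SMALL LETTER DOTLESS I


def turkish_upper_per_char(text):
    """Character-by-character Turkish uppercasing (hypothetical helper)."""
    out = []
    for ch in text:
        if ch == "i":
            out.append(CAP_I_DOT)      # Turkish: i -> capital I with dot
        elif ch == DOTLESS_I:
            out.append("I")            # Turkish: dotless i -> plain I
        else:
            out.append(ch.upper())     # language-independent fallback
    return "".join(out)


def turkish_upper_grapheme(text):
    """Same mappings, but <i, U+0307> is treated as one unit (assumption)."""
    out = []
    pos = 0
    while pos < len(text):
        ch = text[pos]
        if ch == "i":
            out.append(CAP_I_DOT)
            # Absorb an explicit dot so the result carries a single dot.
            if pos + 1 < len(text) and text[pos + 1] == DOT_ABOVE:
                pos += 1
        elif ch == DOTLESS_I:
            out.append("I")
        else:
            out.append(ch.upper())
        pos += 1
    return "".join(out)


# Per-character mapping keeps the explicit dot: U+0130 followed by U+0307.
two_dots = turkish_upper_per_char("i" + DOT_ABOVE)

# Grapheme-level mapping yields U+0130 alone.
one_dot = turkish_upper_grapheme("i" + DOT_ABOVE)

# Richard's <U+0131, U+0307> gives <I, U+0307>, which NFC composes to U+0130.
from_dotless = unicodedata.normalize(
    "NFC", turkish_upper_per_char(DOTLESS_I + DOT_ABOVE)
)
```

In this sketch <U+0131, U+0307> already ends up with a single dot under the per-character mapping, because <I, U+0307> is canonically equivalent to U+0130; only the <i, U+0307> sequence needs the lookahead.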

