On Thursday, July 10, 2003 8:37 PM, Kenneth Whistler <[EMAIL PROTECTED]> wrote:
> Peter Kirk asked:
>
> > > In Turkish and Azeri the sequences f - i and f - dotless i both
> > > occur, and are fairly frequent. So it is inappropriate in these
> > > languages to use fi ligatures in which the dot on the i is lost
> > > or invisible, at least where the second character is a dotted i.
> > > Has any thought been given to this issue? Is it possible to block
> > > such ligation on a language-dependent basis?
>
> and Philippe Verdy responded with another question:
>
> > Isn't there a "Grapheme Disjoiner" format control character to
> > force the absence of a ligature like <fi>, i.e. <f, GDJ, i>?
>
> The answer to Philippe's rejoinder question is no, there is not
> a "Grapheme Disjoiner" format control character.

I did not refer to a specific Unicode character. I knew that one is already dedicated to this purpose, but I did not want to comment on that choice. There is no contradiction: the "Grapheme Disjoiner", for you, is ZWNJ. OK.

I also did not want to promote any change to any legal, legacy-encoded text, only to suggest ways to solve the apparent rendering problem in Turkish, where the encoded character pair <f, i> may be badly rendered.

For the actual rendering, selecting a <fi> ligature is not appropriate for Turkish, and in fact the compatibility decomposition <f, i> has no linguistic ambiguity in Turkish. So whatever the <fi> encoded codepoint designates, it is not the <fi> ligature glyph but really two characters, whose ligation may still be performed according to language context. A font that automatically selects a <fi> ligature to represent a sequence of <f, i> codepoints, merely because the <fi> codepoint is compatibility-equivalent to that sequence, is probably defective and nonconforming.
Such selection of a ligature must be put under the control of the renderer with additional markup, which can in fact select among three ligatures in Turkish: the <fi> ligature glyph, where the f is ligated with the dot above the i (the normal ligature for languages other than Turkish/Azeri), and the <f-dotted-i> and <f-dotless-i> ligatures for Turkish/Azeri. Markup is necessary to select the appropriate glyph, or the glyph can be selected by using the "Grapheme Disjoiner" (ZWNJ) or the "Grapheme Joiner" (ZWJ) in addition to an <i> or <dotless-i> codepoint, possibly followed by the <i-above> diacritic. All this enrichment is assumed to be under the control of the markup added to the original text, which itself does not need to specify whether ligatures should or should not be used.
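
To make the ZWNJ option concrete, here is a minimal sketch of how a preprocessing step might insert U+200C between f and a dotted i when the text is tagged as Turkish or Azeri. The function name `block_fi_ligatures` and the use of the language codes "tr" and "az" are assumptions made for illustration only. Whether a given renderer or font actually honors ZWNJ here is also an assumption, not something this code can guarantee.

```python
ZWNJ = "\u200c"       # ZERO WIDTH NON-JOINER
DOTLESS_I = "\u0131"  # LATIN SMALL LETTER DOTLESS I

# Hypothetical set of language tags for which the dot on the i is distinctive.
DOT_SENSITIVE_LANGS = {"tr", "az"}


def block_fi_ligatures(text: str, lang: str) -> str:
    """Insert ZWNJ between 'f' and a dotted 'i' for Turkish/Azeri text.

    Sequences of 'f' followed by dotless i are left alone, since no dot
    can be lost there. Text in other languages is returned unchanged.
    """
    if lang not in DOT_SENSITIVE_LANGS:
        return text
    # After replacement no bare "fi" remains, so the call is idempotent.
    return text.replace("fi", "f" + ZWNJ + "i")


if __name__ == "__main__":
    sample = "fil f" + DOTLESS_I + "st" + DOTLESS_I + "k"
    marked = block_fi_ligatures(sample, "tr")
    # Show the code points so the invisible ZWNJ can be seen.
    print(" ".join(f"U+{ord(c):04X}" for c in marked))
```

This operates on the plain character sequence only; the markup-based alternative described above would instead leave the text untouched and pass the language to the renderer.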

