From: "Antoine Leca" <[EMAIL PROTECTED]>
> On Thursday, April 15, 2004 8:16 PM, Philippe Verdy wrote:
> > I thought it was already answered in this list by a Catalan speaking
> > contributor: the sequence L+middle-dot in Catalan is NOT a combining
> > sequence.
>
> No? Then what is it? Looks like very much one, to me.

It is more exactly a ligature, not a combining sequence. But the second
character of the ligature works more like a diacritic than like a
separate punctuation mark or symbol.

In the future, we could see U+013F and U+0140 used more often than L or
l plus U+00B7, notably in word processors that can detect these
sequences in Catalan text and substitute the ligatures for them. The
ligatures give a more acceptable letter form and allow easier text
handling, e.g. for word selection in user interfaces and for dictionary
lookups. The fact that there is no such L-middle-dot on keyboards should
not be a limit: word processors have more key bindings and more
intelligence than the default keys found on keyboards.

When I see a Catalan word coded with <L, U+00B7, L> it looks very ugly
(notably with monospaced fonts or in Teletext), and I'm sure that
Catalan readers don't like the default presentation. They would much
appreciate support for the ligated <U+013F or U+0140, L> encodings.

I don't think these can be considered "compatibility characters"
introduced only for compatibility with a past ISO standard for Videotex
and Teletext. The compatibility decompositions in the UCD are bad
suggestions (only fallbacks) which create problems that did not exist in
the Videotex standard (they already create a problem for
internationalized domain names). But now that the decompositions are
normative, there's no way to change them in Unicode.

The only safe way to change things would then be to have a middle-dot
diacritic (combining, but with combining class 0) to be used instead of
U+00B7, even if there's no canonical equivalence with the U+013F and
U+0140 ligatures. A Catalan keyboard would then return this new dot
instead of U+00B7, and word processors or input method editors would
easily find a way to represent it using the ligature when it follows an
L.

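To illustrate the kind of substitution a word processor could make, here
is a minimal Python sketch. The names to_ligature and words are made up
for this example, and the rule itself (replace the dot only when it sits
between two Ls) is my assumption about what such software would do, not
a description of any existing product.

```python
import re

# <L or l, U+00B7 MIDDLE DOT> immediately followed by another L or l.
# The lookahead keeps the second L out of the match (assumed rule).
_L_DOT = re.compile("([Ll])\u00B7(?=[Ll])")


def to_ligature(text: str) -> str:
    """Replace <L, U+00B7> with U+013F and <l, U+00B7> with U+0140."""
    def repl(match: re.Match) -> str:
        return "\u013F" if match.group(1) == "L" else "\u0140"
    return _L_DOT.sub(repl, text)


def words(text: str) -> list[str]:
    """Naive word selection: runs of characters matched by \\w."""
    return re.findall(r"\w+", text)


plain = "col\u00B7lega"        # l + U+00B7 + l
ligated = to_ligature(plain)   # U+0140 + l

print(words(plain))    # ['col', 'lega']: U+00B7 is punctuation, the word splits
print(words(ligated))  # ['coŀlega']: U+0140 is a letter, the word stays whole
```

The second print shows the text-handling point: a naive word selection
keeps the ligated form in one piece without any Catalan-specific rule.
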
If such a character were added, I would give it the general category
"Mn" and combining class 0, to match linguistic expectations. It would
work with IRI and IDN as well, and would immediately work with all basic
Unicode text processing without needing an exception for Catalan. This
new character could have a compatibility decomposition into U+00B7, only
as a fallback; and the existing ligatures U+013F and U+0140 could be
annotated with a better decomposition using this new character than the
current compatibility decompositions with U+00B7.
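
For comparison, the current properties can be inspected with Python's
standard unicodedata module. This is only a sketch of what the UCD data
shipped with Python reports; the sample word is an arbitrary choice, and
nothing here models the proposed new character, which does not exist.

```python
import unicodedata

# The two existing ligatures and their compatibility decompositions.
for ch in ("\u013F", "\u0140"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch),
          unicodedata.decomposition(ch))

# U+00B7 today: general category and canonical combining class.
dot = "\u00B7"
dot_category = unicodedata.category(dot)   # 'Po' (punctuation), not 'Mn'
dot_ccc = unicodedata.combining(dot)       # 0
print(dot_category, dot_ccc)

# NFC leaves the ligature alone; NFKC (the form applied by IDN nameprep)
# replaces it with l + U+00B7 and never composes it back.
sample = "co\u0140lega"
nfc = unicodedata.normalize("NFC", sample)
nfkc = unicodedata.normalize("NFKC", sample)
print(nfc == sample, nfkc == "col\u00B7lega")
```
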

