On Friday, April 16, 2004 12:37 PM, Philippe Verdy wrote:
> In some future, we could see U+013F and U+0140 used more often than L
> or l plus U+00B7...

I (personally) hope we would not.

> Notably in word processors that can detect these
> sequences in Catalan text and substitute them with the ligatures,
> which create a more acceptable letter form and allows easier text
> handling for (e.g.) word selection in user interfaces and dictionary
> lookups.

As I wrote earlier, if you know the text under inspection is Catalan,
a very simple regular expression will deal with that. Any half-decent
Catalan word processor does it already, by the way.

> The fact that there's no such L-middle-dot on keyboards should not be
> a limit: word processors have more key bindings and more intelligence
> than the default keys found on keyboards.

Yes yes yes. Particularly when I want to insert a · afterwards between
two l's, because it turns out I missed it on the first shot (yes, it
happens). Or when I want to remove a superfluous one that I typed by
mistake (yes, it happens too).

With your "intelligence", this latter point will prove to be a
headache: on the first shot, a normal user will place the caret just
after the dot and press Rubout. Slurp, the whole U+0140 is swallowed,
but usually the user will not notice it. So on second look (perhaps a
long time later, perhaps after a useless additional printout), she will
have to type the first l back in.

Intelligent keyboards might be great. But to be so, they have to bring
*much* added value (like, obviously, making it possible to type a
language that would be impossible otherwise; or, more simply, sparing
you from typing Alt+0156 every five minutes). If they bring only very
little value, they are more annoying than anything else, particularly
when they are not permanent but only operate from time to time. This
would be the case here: as a Catalan writer, I sometimes type texts in
the word processor, where I would be "helped".
And sometimes in the mail reader, or on the console, where I would not,
for example because I do not want to wait two full minutes for all the
"helpers" to kick in every time I have to type the name of the user of
a given process...

> When I see a Catalan word coded with <L, U+00B7, L> it looks very
> ugly (notably with monospaced fonts or in Teletext) and I'm sure that
> Catalan readers don't like the default presentation.

Yes, it looks ugly. But this is in fact less ugly to me than seeing l.l
or l-l.

Ugliness is in the eye of the beholder, of course. When you are in the
habit of seeing some rendering of l·l about every hour, you no longer
notice it. In fact, I notice it more when someone uses the kerned
version advocated by Gabriel Valiente, because nowadays it is unusual.
And I certainly would not use the kerned version in an institutional
document, because I do not want to inconvenience my readers (this
question came up among us about 20 days ago, and there was no debate).

> They will much
> appreciate the support for the ligated <U+013F or U+0140, L>
> encodings.

What do you prefer?

  El col·legi Miguel Hernández de Riola?
  El co[]legi Miguel Hernández de Riola?

([] is ASCII art for a box, which is how many, many people would see
any use of U+0140...)

> I don't think they can be considered "compatibility
> characters" just introduced for compatibility with a past ISO
> standard for Videotex and Teletext.

Sorry, you are fighting a lost battle: no one here uses them, so the
whole corpus is already encoded without them.

The mills of Don Quixote are in Mota del Cuervo, only about 200 km from
here, but that is not in the Catalan-speaking region ;-).

> The only safe way to change things would then be to have a middle-dot
> diacritic (combining but with combining class 0) to be used instead
> of U+00B7, even if there's no canonical equivalence with the U+013F
> and U+0140 ligatures... A Catalan keyboard would then return this new
> dot instead of U+00B7, and word processors or input method editors
> would easily find a way to represent it using the ligature when it
> follows a L.
[snip]

May I suggest U+1000B7 for this new character?

Antoine
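
For illustration only: the "very simple regular expression" mentioned
above is not spelled out in the message, so what follows is one possible
reading, assuming the text is already known to be Catalan. It matches an
l or L followed by U+00B7 only when another l or L comes next, and maps
the pair to U+0140 or U+013F; the reverse mapping is a plain string
replacement. The names to_ligature and from_ligature are hypothetical.

```python
import re

MIDDLE_DOT = "\u00B7"   # MIDDLE DOT
L_DOT_UPPER = "\u013F"  # LATIN CAPITAL LETTER L WITH MIDDLE DOT
L_DOT_LOWER = "\u0140"  # LATIN SMALL LETTER L WITH MIDDLE DOT

# l or L, then a middle dot, but only when another l or L follows
# (the lookahead keeps the second l out of the match).
_GEMINATE = re.compile("([lL])" + MIDDLE_DOT + "(?=[lL])")


def to_ligature(text: str) -> str:
    """Replace <l, U+00B7> before an l with U+0140 (U+013F for capital L)."""
    return _GEMINATE.sub(
        lambda m: L_DOT_UPPER if m.group(1) == "L" else L_DOT_LOWER, text
    )


def from_ligature(text: str) -> str:
    """Expand U+013F / U+0140 back to <L or l, U+00B7>."""
    return (
        text.replace(L_DOT_UPPER, "L" + MIDDLE_DOT)
            .replace(L_DOT_LOWER, "l" + MIDDLE_DOT)
    )
```

A lookup or word-selection routine could apply either direction to both
sides before comparing, which leaves the stored text itself untouched.
Note that the ligated form is one code point shorter, which is exactly
why a single Rubout after the dot removes the first l as well.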

