RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-12 Thread Alan Wood
Christopher John Fynn wrote: Print e.g. oestrogen (where oe represents a single sound), but, e.g., chloro-ethane (not chloroethane) to avoid confusion. Please don't try to apply these rules to chemical nomenclature - there are already enough people who get the hyphens wrong, without

RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-12 Thread jarkko.hietaniemi
The same people consider Latin a dead language, suitable only for study of ancient documents, which is clearly not the view taken at the Vatican, which continues to produce new documents in that language. In recent encyclicals, however, at least as published at www.vatican.va, the æ and oe

RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-12 Thread jarkko.hietaniemi
One can get daily news in Latin, too: http://www.yle.fi/fbc/latini/ Correction: a weekly review.

Re: RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-12 Thread Rick McGowan
One can get daily news in Latin, too: http://www.yle.fi/fbc/latini/ Complete with a very nice recitation in Latin! http://www.yle.fi/fbc/latini/recitatio.html

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-11 Thread Christopher John Fynn
John Cowan wrote: Kent Karlsson scripsit: E.g., it is quite legitimate to render, e.g. LIGATURE FI as an f followed by an i, no ligation, whereas that is not allowed for the ae ligature/letter, nor for the oe ligature. How do you know that? Either Caesar or Cæsar

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-11 Thread Mark Davis
No. One cannot make such a black and white statement (correctly, at least). The OED does use Cæsar, for example. While most people would consider it slightly old-fashioned to use that form, it is done. Mark

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-11 Thread Jim Allan
John Cowan posted: How do you know that? Either Caesar or Cæsar is good Latin. Christopher John Fynn posted in response: No. Hart's Rules: VOWEL-LIGATURES The combinations æ and œ should each be printed as two letters in Latin and Greek words, e.g. Aeneid, Aeschylus, Caesar, Oedipus, Phoenicia;

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-11 Thread Curtis Clark
John Hudson wrote: The same people consider Latin a dead language, suitable only for study of ancient documents, which is clearly not the view taken at the Vatican, which continues to produce new documents in that language. In recent encyclicals, however, at least as published at

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread Pim Blokland
John Cowan schreef: Digraphs and ligatures are both made by combining two glyphs. In a digraph, the glyphs remain separate but are placed close together. In a ligature, the glyphs are fused into a single glyph. Oh, in that case I must say I think the UnicodeData.txt file doesn't do a very

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread John Cowan
Pim Blokland scripsit: For instance, the Danish ae (U+00E6) is not designated a ligature, It was in Unicode 1.0; I think politics were involved in that one. In Latin use, ae is most certainly a ligature, and likewise in the languages (including English) that have borrowed words involving it. In

RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread Kent Karlsson
The names do NOT always provide correct descriptions of the characters. This is especially true for digraph and ligature (and in the case of U+00E6 too), as well as (e.g.) SCRIPT CAPITAL P, which is neither script, nor capital (it's lowercase), though it is a p... In addition, there are
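
The mismatch between a character's name and its properties can be checked directly. The following is a minimal sketch, assuming Python's standard unicodedata module (which is not mentioned in the thread); the helper name describe is hypothetical and chosen only for illustration.

```python
import unicodedata

def describe(ch):
    """Return (code point, formal name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

# Characters discussed in the thread: ae, ij, oe, and SCRIPT CAPITAL P.
for ch in ("\u00E6", "\u0133", "\u0153", "\u2118"):
    print(*describe(ch))
```

In this output U+00E6 is named LETTER AE while U+0133 and U+0153 are named LIGATURE, and U+2118 is named CAPITAL even though its general category is Sm (math symbol) rather than Lu (uppercase letter).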

RE: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread Kent Karlsson
For instance, the Danish ae (U+00E6) is not designated a ligature, It was in Unicode 1.0; I think politics were involved in that one. In Latin use, ae is most certainly a ligature, and likewise in the languages (including English) that have borrowed words involving it. In Danish use,

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread John H. Jenkins
On Friday, March 7, 2003, at 04:26 AM, Pim Blokland wrote: Oh, in that case I must say I think the UnicodeData.txt file doesn't do a very good job. For instance, the Danish ae (U+00E6) is not designated a ligature, but the Dutch ij (U+0133) is, even though the a and e are clearly fused

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread Roozbeh Pournader
On Fri, 7 Mar 2003, John H. Jenkins wrote: since different people speaking different languages often have different perceptions of what a symbol is. Reminds me of ISIRI 3342 that officially considered symbol and character the same thing and used one word (namaad, Noon, Meem, Alef, Dal) for

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread Pim Blokland
Kent Karlsson schreef: Typographically, it's a ligature either way. You mean that both ae and ij should be called ligatures, although one is fused and the other isn't? OK, I can live with that. I'd rather the ij were called a digraph, though. The ij is considered by some to be one letter in

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread John Cowan
Kent Karlsson scripsit: E.g., it is quite legitimate to render, e.g. LIGATURE FI as an f followed by an i, no ligation, whereas that is not allowed for the ae ligature/letter, nor for the oe ligature. How do you know that? Either Caesar or Cæsar is good Latin. -- After fixing the Y2K bug
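
One way to see what the character database itself records about splitting these characters is to look at their decomposition mappings. This is a hedged sketch, assuming Python's standard unicodedata module; the helper name compat_parts is hypothetical. It shows only what the data says and does not settle the question of Latin usage raised above.

```python
import unicodedata

def compat_parts(ch):
    """Return the characters in ch's decomposition mapping, or [] if it has none."""
    fields = unicodedata.decomposition(ch).split()
    # Skip a leading tag such as "<compat>"; the remaining fields are hex code points.
    return [chr(int(f, 16)) for f in fields if not f.startswith("<")]

# LIGATURE FI and LIGATURE IJ versus the ae and oe characters.
for ch in ("\uFB01", "\u0133", "\u00E6", "\u0153"):
    print(unicodedata.name(ch), compat_parts(ch))
```

LIGATURE FI and LIGATURE IJ carry compatibility decompositions to two letters, while U+00E6 and U+0153 have no decomposition at all, so normalization never turns them into a plus e or o plus e.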

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread John Cowan
Pim Blokland scripsit: The ij is considered by some to be one letter in Dutch, and when written down, an i and a j together look very much like a written y with diaeresis. (See fonts like Script MT.) So I can understand foreigners getting confused and encoding it that way (as a y with

Re: FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-07 Thread Doug Ewell
Michael Everson everson at evertype dot com wrote: You mean that both ae and ij should be called ligatures, although one is fused and the other isn't? OK, I can live with that. I'd rather the ij were called a digraph, though. These terms are not normative. Get used to it. The names

FAQ entry (was: Looking for information on the UnicodeData file)

2003-03-05 Thread John Cowan
I've reformatted Pim Blokland's question as a Unicode FAQ. Q: What do the terms turned, inverted, reversed, rotated, inverse, digraph, and ligature used in the names of Unicode characters mean? A: These terms are basically typographical rather than Unicode-specific. A turned character is one