If there is a character set that sports both, it must be used to write down some human language. My point is that there is no language that could make use of this distinction by having both ü and &utrema;. There are languages that use ü, and theoretically there could be ones that use &utrema;, although I do not know of any valid case (I consider the French case invalid).
Chëërs
Chrïs

_____

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sander
Sent: Saturday, June 23, 2007 2:59 PM
To: Kristof Zelechovski; [EMAIL PROTECTED]
Subject: Re: [whatwg] Entity parsing

I hadn't thought of that one ;-) (in Dutch there are no native words with umlauts, only some of German or Scandinavian descent). My question was about char-sets that contain both a trema version and a (separate) umlaut version of the same character. Are there any?

cheers,
Sander

Kristof Zelechovski wrote:

Only the vowel U can have either, but I have not seen a valid example of &utrema;. The orthography "ambigüe" has recently been changed to "ambiguë" for consistency. Polish "nauka" (science) and German "beurteilen" would make good candidates, but the national rules of orthography do not allow this distinction: Slavic languages do not have diphthongs except in borrowed words, and it would cause ambiguity in German (cf. "geübt"). (Incidentally, this leads to bad pronunciation often encountered even in Polish media.)

Cheers
Chris

-----Original Message-----
From: Sander [mailto:[EMAIL PROTECTED]
Sent: Friday, June 22, 2007 9:26 PM
To: Kristof Zelechovski
Subject: Re: [whatwg] Entity parsing

Kristof Zelechovski wrote:

A dieresis is not an umlaut, so I have to bite my tongue each time I write or read nonsense like &iuml;. It feels like lying. Umlaut means "mixed", a dieresis means "standalone". Those are very different things, and "I" can never get mixed, so there is no ambiguïty. Since "umlaut" is borrowed from German, I can see no problem in borrowing "tréma" from French. I personally prefer "&itrema;" to "&idier;" because of readability, but I would not insist on that.

"In professional typography, umlaut dots are usually a bit closer to the letter's body than the dots of the trema. In handwriting, however, no distinction is visible between the two. This is also true for most computer fonts and encodings."
[http://en.wikipedia.org/wiki/Umlaut_(diacritic)]

Are there any char-sets that have both umlaut and trema variations of characters? If so, both entities could exist.

cheers,
Sander

PS: I'd go for "&itrema;" instead of "&idier;" as well, as the term "trema" is also the one that's used in Dutch.
