Marcin 'Qrczak' Kowalczyk wrote:
In a message of Fri, 23-07-2004, 18:01 +0200, Philipp Reichmuth wrote:
However, to return to the original problem, I don't remember ever having seen any data where it would be necessary to distinguish between trema and diaeresis in the data itself.
A similar issue: a Polish encyclopaedia I have from 1985 sorts words with Ó differently depending on whether this is Polish Ó (sorted between O and P, just as other Polish letters come after their unaccented counterparts) or foreign Ó (folded with O, just as other foreign accents are folded). Both are typeset in the same way.
MOQUETTE
MÓR [mo:r], city in Hungary
MORA
MÓRA [mo:ro] Ferenc, Hungarian writer
MORACZEWSKA
[...]
MOŻNOWŁADZTWO
MÓR (a Polish word)
[...]
MÓŻDŻEK (a Polish word)
MPHAHLELE
The context is somewhat different in these two cases though: in the case of Umlaut vs. Tréma, the distinction is between two different, well-defined functions of the same diacritic that traditional German scholarship is aware of (if for no other reason, at least because of the influence of the rather significant body of Classical Greek scholarship that Germany produced), even if the use of one of them is foreign to lexical items of native German vocabulary.
In the case of Ó in Polish, there is the native function on one side (using Ó to write a U that is etymologically connected with an O, if I'm not mistaken). On the other side are all the non-native functions, grouped together: Hungarian Ó denoting a long O, Spanish Ó denoting a stressed O (this may be the case with Portuguese Ó as well, but I'm not sure), and then the Icelandic and Irish Gaelic Ó, which may have a fourth and a fifth function.
I'm not aware of native and non-native Ó being sorted any differently in Hungarian encyclopedias, but it may happen in one or two of the other languages I listed (or in yet others I'm not aware of). I'm rather inclined to think that using e.g. a plain COMBINING ACUTE vs. CGJ + COMBINING ACUTE is a dangerous way of approaching problems like Polish vs. non-Polish Ó. The relevant point here seems to be the language the word is in (I understand Unicode also has standard language markers defined in its inventory).
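To make the point concrete, here is a rough sketch of how the encyclopaedia's ordering could be reproduced when the language travels with each headword instead of being encoded in the letter. Everything in it is my own assumption for illustration: the names POLISH_ALPHABET, fold_accents and sort_key, the position of Q, V and X in the alphabet string, and the language tags themselves ("und" simply marks headwords whose language I have not tried to determine). It is not how any real collation library works, only a minimal model of the rule Marcin described.

```python
import unicodedata

# Assumed ordering: the Polish alphabet with Q, V and X slotted in
# at their Latin positions. Each accented Polish letter follows its base.
POLISH_ALPHABET = "AĄBCĆDEĘFGHIJKLŁMNŃOÓPQRSŚTUVWXYZŹŻ"
POLISH_RANK = {ch: i for i, ch in enumerate(POLISH_ALPHABET)}


def fold_accents(word):
    """Strip combining marks so that a foreign accented letter sorts as its base."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def ranks(word):
    """Map each letter to its rank; unknown letters sort after the alphabet."""
    return [POLISH_RANK.get(ch, len(POLISH_ALPHABET)) for ch in word.upper()]


def sort_key(entry):
    """Key for a (headword, language tag) pair.

    Polish words keep their accented letters as separate letters;
    words in any other language are folded first. The unfolded ranks
    act as a tie-breaker, so MORA precedes the Hungarian MÓRA.
    """
    word, lang = entry
    primary = word if lang == "pl" else fold_accents(word)
    return (ranks(primary), ranks(word))


# Headwords from the excerpt above; the language tags are my guesses.
entries = [
    ("MÓŻDŻEK", "pl"),
    ("MÓRA", "hu"),
    ("MPHAHLELE", "und"),
    ("MOQUETTE", "und"),
    ("MÓR", "pl"),
    ("MORA", "und"),
    ("MOŻNOWŁADZTWO", "pl"),
    ("MÓR", "hu"),
    ("MORACZEWSKA", "pl"),
]

for word, lang in sorted(entries, key=sort_key):
    print(word, lang)
```

With these tags the output follows the order of the encyclopaedia excerpt: the Hungarian MÓR lands between MOQUETTE and MORA, while the Polish MÓR comes after MOŻNOWŁADZTWO. The two MÓR entries are the same string; only the tag tells them apart.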
Regards,
bushmanush