On Feb 4, 2009, at 6:54 AM, Uwe Fischer wrote:
John W Kennedy wrote:
On Feb 3, 2009, at 9:47 AM, Uwe Fischer wrote:
For example, in the German spelling bible called "Duden", the
umlauts Ä, Ö, Ü are sorted as if they were just plain A, O, U
characters.
Really? I worked for American Hoechst and its successors for 29
years, and I was always instructed to collate them as equal to AE
OE UE (and ß as equal to ss, of course).
The AE, OE and UE are replacements for Ä, Ö and Ü for cases where the
real characters are not available (for example, on old
typewriters, or on early computer displays limited to 7-bit ASCII).
Today, with Unicode everywhere, there is no excuse for not using the
Umlaute or all the other special characters.
I am aware of that; ich spreche ein bisschen Schuldeutsch. This was
quite unrelated; I was instructed in technical specifications, in so
many words, to implement a collating rule that substituted AE, OE, UE,
or SS, so that, z.B., "Bär" collated between "Bad" and "baff". This
required no small effort to include in various programs, and to hear
now, a dozen and more years later, that this was wasted effort, well,
davor bin ich ganz wirklich baff.
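
To make the difference between the two rules concrete, here is a minimal
sketch in Python. It is an illustration only, not ICU's or Duden's actual
algorithm: the names EXPAND, BASE, expansion_key and base_letter_key are
made up for this example, and both keys simply fold case and ignore the
tie-breaking that a real collator would apply.

```python
# Hypothetical sort keys for illustration; not a real collation library.

# Substitution rule from the specifications: ä -> ae, ö -> oe, ü -> ue, ß -> ss.
EXPAND = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})

# Base-letter rule: ä -> a, ö -> o, ü -> u (ß still becomes ss).
BASE = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})


def expansion_key(word):
    """Sort key that treats umlauts as AE, OE, UE."""
    return word.lower().translate(EXPAND)


def base_letter_key(word):
    """Sort key that treats umlauts as plain A, O, U."""
    return word.lower().translate(BASE)


words = ["baff", "Bär", "Bad"]
by_expansion = sorted(words, key=expansion_key)
by_base = sorted(words, key=base_letter_key)

print(by_expansion)  # ['Bad', 'Bär', 'baff']
print(by_base)       # ['Bad', 'baff', 'Bär']
```

Under the substitution rule "Bär" is compared as "baer" and lands between
"Bad" and "baff"; under the base-letter rule it is compared as "bar" and
lands after "baff".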
This replacement rule is not related to the sort order rule. The
German sorting order is given by this string:
ExemplarCharacters{"[a ä b-o ö p-s ß t u ü v-z]"}
found in
http://source.icu-project.org/repos/icu/icu/trunk/source/data/locales/de.txt
For English it is not as complicated:
ExemplarCharacters{"[a-z]"}
--
John W Kennedy
"Information is light. Information, in itself, about anything, is
light."
-- Tom Stoppard. "Night and Day"