On Thu, 20 Sep 2001, Dan Sugalski wrote:
> I've also been told that the problem even exists in Western European 
> languages--some languages consider accented (or umlauted, or tilde'd, or 
> whatever) characters different from the un-accented version, and some 
> don't. And in some cases two different languages will sort the same mix of 
> accented and unaccented characters differently.

In Swedish, the 3 accented letters are successors of "z" in this order:
å, ä, ö.

In German, the 4 special letters ä, ö, ü, ß are immediate successors
of a, o, u, z respectively. Note that the first two conflict with
Swedish's idea of sorting. Also, ß is often written "ss" (!).
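
To make the Swedish/German conflict concrete, here is a rough Python
sketch. The two alphabet strings are my own transcription of the
orderings described above, not any official collation table, and
make_key is a made-up helper name.

```python
import unicodedata

# Assumed orderings, transcribed from the descriptions above.
SWEDISH = "abcdefghijklmnopqrstuvwxyzåäö"   # å, ä, ö after z
GERMAN = "aäbcdefghijklmnoöpqrstuüvwxyzß"   # each letter right after its base

def make_key(alphabet):
    """Build a sort key that ranks letters by their position in `alphabet`."""
    rank = {ch: i for i, ch in enumerate(alphabet)}
    def key(word):
        # NFC so that "a" + combining diaeresis becomes the single letter "ä".
        word = unicodedata.normalize("NFC", word.lower())
        # Unknown characters sort after every known letter.
        return [rank.get(ch, len(alphabet)) for ch in word]
    return key

words = ["zebra", "äpple", "apa", "öl", "orm"]
swedish_sorted = sorted(words, key=make_key(SWEDISH))
german_sorted = sorted(words, key=make_key(GERMAN))
print(swedish_sorted)
print(german_sorted)
```

Same five words, two different orders: "äpple" lands after "zebra"
with the Swedish table and right after "apa" with the German one.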

In Esperanto, each of the 6 accented letters is the immediate successor
of its non-accented version. ISO-8859-3, Esperanto's codepage, is rarely
used, so ŭ may actually be written "ux" or "^u" or "w" depending on
which of the 8 standards you use ("ux" is the most common though).
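
A sketch of what that means for sorting, in Python: fold the "x"
spelling back to real letters first, then rank by alphabet position.
It only handles the x convention; the mapping table, the alphabet
string and the function names are mine, written for illustration.

```python
# Assumed mapping for the "x" convention only ("^u", "w" etc. not handled).
X_SYSTEM = {"cx": "ĉ", "gx": "ĝ", "hx": "ĥ", "jx": "ĵ", "sx": "ŝ", "ux": "ŭ"}

# Esperanto alphabet: each accented letter right after its base letter.
ALPHABET = "abcĉdefgĝhĥijĵklmnoprsŝtuŭvz"
RANK = {ch: i for i, ch in enumerate(ALPHABET)}

def from_x_system(text):
    """Replace x-digraphs such as "cx" with the accented letter."""
    for digraph, letter in X_SYSTEM.items():
        text = text.replace(digraph, letter)
    return text

def esperanto_key(word):
    """Sort key: normalise the spelling, then rank letter by letter."""
    normalised = from_x_system(word.lower())
    return [RANK.get(ch, len(ALPHABET)) for ch in normalised]

words = ["cxevalo", "cent", "dato", "ĉar"]
result = sorted(words, key=esperanto_key)
print(result)
```

The point is that "cxevalo" and "ĉar" end up side by side, between the
c-words and the d-words, whichever spelling was used.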

In French, the 12 accent/letter combinations are treated as equivalent to
non-accented letters, so words starting with e and é will be mixed.
It is possible that otherwise orthographically equivalent words may then
be sorted by their accents, but it's not important. To complicate matters,
French also has œ, but its usage is declining, so most of the time
you just write "oe".
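
In Python that could look roughly like the following. The first part
of the key ignores accents and the second part only breaks ties; the
tie-break by plain code point is a simplification I chose, not the
real French rule, and the function names are made up.

```python
import unicodedata

def strip_accents(word):
    """Drop combining marks after decomposing, so "é" becomes "e"."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def french_key(word):
    """Primary: accent-free spelling. Secondary: the accented spelling."""
    # "œ" has no Unicode decomposition, so it is expanded by hand.
    lowered = word.lower().replace("œ", "oe")
    return (strip_accents(lowered), lowered)

words = ["été", "etat", "œuf", "école", "eau", "ode"]
result = sorted(words, key=french_key)
print(result)

# Words that differ only by accents fall back to the secondary key.
ties = sorted(["côte", "cote"], key=french_key)
print(ties)
```

Words in e and é come out mixed, and "œuf" sorts as if it were
spelled "oeuf".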

Beyond that... some languages use a mixed scheme, e.g. Estonian (afaik)
puts accented S and Z in place, but inserts the other accented letters
between W and X, and moves the Z's right after the S. Icelandic is almost
as special.
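
For the Estonian case, a table-driven key is about the only thing that
works. A Python sketch, where the alphabet string is the Estonian order
as far as I know it and should be checked before anyone relies on it:

```python
# Assumed Estonian order: š after s, then z and ž, and õ ä ö ü between w and x.
ESTONIAN = "abcdefghijklmnopqrsšzžtuvwõäöüxy"
RANK = {ch: i for i, ch in enumerate(ESTONIAN)}

def estonian_key(word):
    """Rank each letter by its position in the assumed alphabet."""
    # Characters outside the table sort after every known letter.
    return [RANK.get(ch, len(ESTONIAN)) for ch in word.lower()]

words = ["tee", "zoo", "õun", "saal", "xerox", "vesi", "šokk"]
result = sorted(words, key=estonian_key)
print(result)
```

Note that "zoo" sorts before "tee", and "õun" before "xerox", which no
accent-stripping trick would give you.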

________________________________________________________________
Mathieu Bouchard                   http://hostname.2y.net/~matju


