On 18.10.2011 23:57, Ian MacArthur wrote:
> Questions that might be relevant are...
>
> What does strcasecmp() make of the ("i", "ı") or ("I", "İ") cases?
> Presumably they are declared as non-matches too?
> So it is not the case that is at issue here, but the fact that strcasecmp()
> thinks that dotted/non-dotted I letters are distinct?
>
> Are they generally considered as distinct? What happens when a text that is
> in a non-Turkish Latin script is parsed, by a Turkish system?
> In that case, it might be correct to parse i and I as equivalent (that's
> dotted-small-i and non-dotted-caps-I) since they probably are equivalent in
> the source language.
> This is (I think) the use-case Corvid is thinking about...
>
> Are there parallels in other languages that are pertinent?
> How are O and Ö handled (or U and Ü I guess) in languages that use them?
> Are they sorted as "the same" or as different letters?
In German, an Umlaut (äöü, ÄÖÜ) is a different letter from the
base character without the dots (aou, AOU).

WRT sorting: there are two different conventions:

(1) In a well-known dictionary (Duden), a and ä seem to be
    considered equivalent for sorting, e.g.
    Mahraun < Mähre < Mähren < Mahrenholz.

(2) In telephone books an Umlaut is considered equivalent(?)[1]
    to the base vowel + "e" (ä == ae, Ü == UE, etc.), e.g.
    Modul < Möbel < Möller <= Moeller < Mof... .

[1] In fact, I can't even tell whether "oe" == "ö" or whether there
    is an order between them if everything else is equal.
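
The two conventions above can be sketched as sort keys. This is only a
rough illustration in Python, not anything FLTK does: the names
duden_key and phonebook_key are made up for this example, and the
handling of ties (Möller vs. Moeller) is simply left to the input
order, since the actual rule is unclear (see [1]).

```python
import unicodedata

def duden_key(word):
    """Dictionary-style key (assumed rule): ä sorts like a, ö like o, ü like u."""
    # NFD splits "ä" into "a" plus a combining diaeresis, which is then dropped.
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()

# Phone-book-style expansion (assumed rule): Umlaut -> base vowel + "e".
_PHONEBOOK = str.maketrans({
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
})

def phonebook_key(word):
    """Phone-book-style key: ö sorts exactly like oe."""
    return word.translate(_PHONEBOOK).casefold()

dictionary_order = sorted(
    ["Mahrenholz", "Mähren", "Mähre", "Mahraun"], key=duden_key
)
phonebook_order = sorted(
    ["Moeller", "Möller", "Möbel", "Modul"], key=phonebook_key
)

print(dictionary_order)
print(phonebook_order)
```

With these keys, Möller and Moeller compare as equal, so Python's
stable sort keeps them in whatever order they had in the input.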
Neat, isn't it?
Albrecht