Kaixo!
On Wed, Aug 29, 2001 at 06:25:38PM -0400, [EMAIL PROTECTED] wrote:
> Markus,
> Thanks for your help with the German example.
> But I don't understand your Swedish example here.
> I think in en_US.utf8, "ä" (0xC3,0xA4) is after "z".
No.
en_US uses the default sorting, which treats all letters with diacritics
the same as their base letters, and ligatures (e.g. ae, oe) as their two
component letters.
I don't know offhand how special letters like eng, eth or thorn are handled.
> do you mean, in sv_SE.utf8, "ä" is also after "z"?
No, *only* sv_SE (and the other Scandinavian locales) sort that way.
The "C" locale does too, but by accident: in fact the "C" locale doesn't have
any sorting policy at all, it just sorts byte values, not letters.
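A quick way to see the byte-value ordering for yourself, as a rough sketch (c_sort is a helper name I made up for this mail, not a standard tool, and the "ä" is spelled out as octal escapes so the result doesn't depend on your terminal encoding):

```shell
# c_sort (hypothetical helper): sort the arguments, one per line,
# using plain byte comparison, i.e. the "C" locale.
c_sort() {
    printf '%s\n' "$@" | LC_ALL=C sort
}

# "ä" as its UTF-8 bytes 0xC3 0xA4 (octal 303 244).
a_uml=$(printf '\303\244')

# Dump the raw bytes: ä is c3 a4, z is 7a.
printf '%s' "$a_uml" | od -An -tx1
printf 'z' | od -An -tx1

# 0x7a < 0xc3, so z is listed first. That is the same order sv_SE
# gives, but here it only falls out of the byte values.
c_sort "$a_uml" z
```

Any other non-ASCII letter lands after "z" in the same way under "C", whatever language it belongs to.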
> then how can we compare ?
> I think strcoll("z", "ä") should return -1 in both locale settings.
> Thanks.
test:~# LC_ALL=en_US.UTF-8 bash -c 'echo -e "ä\nz" | sort'
ä
z
test:~# LC_ALL=sv_SE.UTF-8 bash -c 'echo -e "ä\nz" | sort'
z
ä
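Note that those results assume both locales are actually installed on the machine; if one is missing, sort typically ends up using "C" ordering instead. A minimal sketch for checking (have_locale is a hypothetical helper of mine, and the way `locale -a` spells the names, e.g. "utf8" vs "UTF-8", varies between systems, hence the normalisation):

```shell
# have_locale (hypothetical helper): succeed if the named locale is
# listed by `locale -a`, ignoring case and "UTF-8" vs "utf8" spelling.
have_locale() {
    want=$(printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/-//g')
    locale -a 2>/dev/null | tr '[:upper:]' '[:lower:]' | sed 's/-//g' |
        grep -qxF -- "$want"
}

# Report on the two locales used in the transcript above.
for loc in en_US.UTF-8 sv_SE.UTF-8; do
    if have_locale "$loc"; then
        echo "$loc: installed"
    else
        echo "$loc: missing, sort would not use its collation rules"
    fi
done
```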
BTW, why exactly are you running these tests? If you ask what you actually
want to know, maybe somebody can answer you.
--
Ki ça vos våye bén,
Pablo Saratxaga
http://www.srtxg.easynet.be/ PGP Key available, key ID: 0x8F0E4975
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/