On 9/25/19 10:20 AM, Peng Yu wrote:
> Hi,
> It seems that "café" should be sorted before "caff" in Unicode.
> https://github.com/jtauber/pyuca
> But `sort` does not do so.
> $ printf '%s\n' cafe caff café | LC_ALL=UTF8 sort
> cafe
> caff
> café
> $ printf '%s\n' cafe caff café | LC_ALL=en_US.UTF-8 sort
> cafe
> caff
> café
> How to make `sort` sort according to Unicode order? Thanks.
You'll have to write a locale definition where strcoll() sorts in the
order you want. Coreutils sort calls strcoll() to compare lines, so if
it doesn't sort the way you think it should, the bug is in your locale
definition and not in coreutils. You'll want to report this issue to
whoever provided your en_US.UTF-8 locale (perhaps glibc?).
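To see how much of the result comes from the locale rather than from
sort itself, it can help to compare against the C locale, where
comparison is plain byte order. The sketch below is only an
illustration: sort_in_locale is a hypothetical helper name, not part of
coreutils, and it assumes the terminal and the input are UTF-8. Also
note that a locale name the system does not recognise (LC_ALL=UTF8 may
be one, check `locale -a`) would, as far as I know, leave sort in the C
locale without any warning.

```sh
# Hypothetical helper: sort the given words under one named locale.
sort_in_locale() {
    loc=$1
    shift
    printf '%s\n' "$@" | LC_ALL="$loc" sort
}

# C locale: byte order. "é" is 0xC3 0xA9 in UTF-8 and "f" is 0x66,
# so this prints cafe, caff, café.
sort_in_locale C cafe caff café

# Named locale: the order is whatever that locale's collation rules
# give strcoll(). If the output matches the C run above, either the
# locale is not installed or its rules really do order the words so.
sort_in_locale en_US.UTF-8 cafe caff café
```

If the two runs differ for other inputs but not for this one, the
locale is being honoured and the ordering is a property of its
collation table.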
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org