If Python can have pyuca, which works across platforms, why can't such a thing exist at the C level?
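For comparison, the one ordering the quoted message below describes as portable (LC_ALL=C) can be checked with a minimal sketch like the following. The helper name `c_sort` is hypothetical and not part of coreutils; it only wraps `sort` so that collation is done by raw byte value rather than by the vendor's locale tables.

```shell
# Hypothetical helper (not from the thread): sort stdin in the POSIX "C"
# locale, i.e. by byte value, ignoring whatever locale the system provides.
c_sort() {
    LC_ALL=C sort
}

# The same three words used in the glibc example quoted below, fed in a
# different order. In byte order, the UTF-8 encoding of "é" (0xC3 0xA9)
# sorts after every ASCII letter, so "café" comes last.
printf '%s\n' café caff cafe | c_sort
```

This should print cafe, caff, café on any POSIX system, which is reproducible but is not the Unicode collation order (glibc's en_US.UTF-8 puts café before caff).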

On Wed, Sep 25, 2019 at 12:24 PM Eric Blake <[email protected]> wrote:
> On 9/25/19 10:56 AM, Peng Yu wrote:
> > I want to make my `sort` to be machine-independent and always use the
> > correct Unicode sort order. Is there a way to do so?
>
> Those two goals are somewhat at odds. The only truly portable
> machine-independent sorting is the one guaranteed by POSIX when you use
> LC_ALL=C (fun fact: even on an EBCDIC machine, that is required by POSIX
> to collate in ASCII order, rather than native byte order). The moment
> you use any other locale, then you not only left to the mercies of
> whoever wrote that locale, but also stuck with the fact that there is no
> portable way to transfer locale definitions from one vendor's libc to
> another.
>
> >
> > I don't know how to check where en_US.UTF-8 comes from. Do you know
> > how to check it? (I use Mac OS X.)
>
> All other locales are somewhat vendor-dependent; as you've discovered,
> your vendor (Apple) has a rather gaping hole in their locale support.
> But because Apple is a closed-source shop, it will have to be Apple that
> fixes their bug, unless you want to take on the gargantuan task of
> writing a gnulib module that provides locale tables to mirror glibc for
> use on non-glibc machines.
>
> Note that glibc doesn't have that problem, at least on my system:
>
> $ cat /etc/fedora-release
> Fedora release 30 (Thirty)
> $ rpm -q glibc
> glibc-2.29-22.fc30.x86_64
> $ printf '%s\n' cafe caff café | LC_ALL=en_US.UTF-8 sort --debug
> sort: text ordering performed using ‘en_US.UTF-8’ sorting rules
> cafe
> ____
> café
> ____
> caff
> ____
>
> So one option you could pursue is switching to an operating system that
> does not curtail your freedoms.
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc. +1-919-301-3226
> Virtualization: qemu.org | libvirt.org
>

--
Regards,
Peng
