"Dr. David Alan Gilbert" <[EMAIL PROTECTED]> wrote:
> * Jim Meyering ([EMAIL PROTECTED]) wrote:
>> [EMAIL PROTECTED] wrote:
>> > I just used sort on a redhat Enterprise 5 server.
>> >
>> > Sort seems to ignore leading "." characters.  This is incorrect.
>>
>> How sort works depends on your locale.
>> This link explains and tells you how to change that behavior:
>>
>> http://www.gnu.org/software/coreutils/faq/coreutils-faq.html#Sort-does-not-sort-in-normal-order_0021
>
> That explanation is somewhat unclear whether it's due to an unexpected
> behaviour of the locale or the locale tables actually being broken.
> I tried to read bits of the Unicode spec last time I hit this
> and came away not being entirely sure whether it was actually
> valid behaviour.
>
> If someone could point to something which says 'you should sort
> these non-alphanumeric characters like this' and the Linux one
> doesn't then perhaps someone will fix it.

The problem is with expectations.  People are not used to sort ignoring
non-alphanumerics, yet with certain locales, it does, and that is normal
and required behavior.

Here's an example.  In the en_US locale, the leading bytes are ignored,
because the locale tables (as required by standards) define the
collating sequences that way:

    $ printf '%s\n' _a .b ,c /d -e \:f| LC_ALL=en_US sort
    _a
    .b
    ,c
    /d
    -e
    :f

When collating with the C locale, those bytes *are* used:

    $ printf '%s\n' _a .b ,c /d -e \:f| LC_COLLATE=C sort
    ,c
    -e
    .b
    /d
    :f
    _a

_______________________________________________
Bug-coreutils mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/bug-coreutils
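
For a script that needs the byte-wise ordering shown above no matter how the caller's environment is set up, one option is to force the C locale for the sort invocation only. What follows is a minimal sketch, not anything shipped with coreutils: the function name sort_bytewise is made up for illustration.

```shell
#!/bin/sh
# sort_bytewise is a hypothetical helper name, not a coreutils command.
# LC_ALL=C applies to this one sort invocation only, so the rest of the
# script keeps the caller's locale. In the C locale, sort compares raw
# byte values, which makes leading punctuation significant.
sort_bytewise() {
    LC_ALL=C sort "$@"
}

# Same input as the examples above; lines come out in byte order.
printf '%s\n' _a .b ,c /d -e :f | sort_bytewise
```

The sketch sets LC_ALL rather than LC_COLLATE because LC_ALL, when it is present in the environment, takes precedence over LC_COLLATE; setting LC_COLLATE=C alone has no effect on a system where LC_ALL is already exported.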
