Hi there,
I am sending this to you directly in addition to the bug reporting
addresses since you seem to be the maintainers of the fileutils and
textutils packages. I sent an e-mail about a smaller facet of the same
problem to [EMAIL PROTECTED] on 7-Dec-2000, but never received a
response.

I am using Linux Mandrake 8.1, with glibc 2.2.4, fileutils 4.1 and
textutils 2.0.14 with the ISO 8859-1 character set (en_GB). I noticed
that ls(1) and sort(1) were not ordering things the way I expected:

    # touch X ab a-c
    # ls
    ab  a-c  X
    # cat > z
    X
    ab
    a-c
    ^D
    # sort < z
    ab
    a-c
    X

To summarise, both ls(1) and sort(1) are ignoring dashes and treating
upper and lower-case characters as equivalent.

I have a number of programs that are designed based on the assumption
that sort(1) will sort things by byte ordering, and I'm sure a lot of
other people have similarly-dependent programs. I also expect ls(1) to
list files beginning with upper-case first letters before those
beginning with lower-case letters, and I'm sure a lot of other people
are used to that too (AFAIK it's always been done that way on UNIX, for
as long as there _were_ lower case letters :)

I have discussed this at length with the Mandrake developers, and they
have told me that the change in behaviour is due to advances made in
glibc with respect to LC_COLLATE handling under ISO 8859-1, and that
this form of collation is intended to be more logical for people to
read. I agree that it is more logical, but I do not think it should be
the default behaviour for sort(1) and ls(1).

Although this currently only affects Mandrake v8.1 (AFAIK), as more
cautious distributions adopt more recent versions of glibc, more and
more people are going to experience this. It can be kludged by
exporting the environment variable LC_COLLATE=POSIX, but that prevents
collation from working at all.
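For reference, the workaround can be tried on the three names from the transcript. This is only a minimal sketch, assuming a POSIX shell and a sort(1) that honours the locale environment. The variable names are illustrative, and LC_ALL=C is used rather than LC_COLLATE=POSIX because LC_ALL takes precedence over LC_COLLATE if both happen to be set:

```shell
# Sample names from the report, one per line (illustrative variable name).
names='X
ab
a-c'

# LC_ALL overrides LC_COLLATE and LANG, so this forces comparison by byte
# value for this one command only, leaving the rest of the session alone.
byte_sorted=$(printf '%s\n' "$names" | LC_ALL=C sort)

printf '%s\n' "$byte_sorted"
# prints:
# X
# a-c
# ab
```

Setting the variable on the command line like this limits the change to a single invocation, which may be preferable to exporting it for the whole login session.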

What I think would be the best solution is if ls(1) and sort(1) (and
possibly other programs in textutils) were designed to sort by
byte-ordering by default, and were given an option to use the
locale-based collation. The existing options for sort(1) include:

    Ordering options:

      -b, --ignore-leading-blanks  ignore leading blanks
      -d, --dictionary-order       consider only blanks and alphanumeric characters
      -f, --ignore-case            fold lower case to upper case characters
      -g, --general-numeric-sort   compare according to general numerical value
      -i, --ignore-nonprinting     consider only printable characters
      -M, --month-sort             compare (unknown) < `JAN' < ... < `DEC'
      -n, --numeric-sort           compare according to string numerical value
      -r, --reverse                reverse the result of comparisons

These all suggest that the intent behind the sort program is to do
byte-ordering unless otherwise directed. The --ignore-case option, for
instance, is now meaningless under ISO 8859-1 because LC_COLLATE makes
upper and lower-cased letters equivalent.

The man page for sort(1) states:

    *** WARNING *** The locale specified by the environment affects sort
    order. Set LC_ALL=C to get the traditional sort order that uses
    native byte values.

But this had not previously been apparent because LC_COLLATE did not
work properly. I realise that I can fix it by exporting
LC_COLLATE=POSIX, but I'm sure I'm not the only one who has assumed that
byte ordering would remain the default action.

Would you consider adding additional options, for instance:

      -l, --use-locale             use the ordering specified by the current
                                   locale in LC_COLLATE instead of byte
                                   ordering

And returning the default behaviour to byte ordering? Similarly the
ls(1) man page states that the default action is to sort file names
alphabetically, and makes no mention of locales.

I believe that this is the right thing to do because it preserves the
existing and expected behaviour, but allows the user to specify
locale-based collation if they want to.
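
Until something like this exists in sort(1) itself, the requested default can be approximated with a small wrapper. The following is a rough sketch only, assuming a POSIX shell. The function name sort_bytes_default is hypothetical, and -l / --use-locale is the option proposed in this message, not an existing sort(1) flag, so the wrapper consumes it rather than passing it on:

```shell
# Hypothetical wrapper: byte ordering by default, locale collation on request.
# The switch is only recognised as the first argument; every other argument
# is passed through to sort(1) unchanged.
sort_bytes_default() {
    case "$1" in
        -l|--use-locale)
            shift
            sort "$@"            # collate according to the caller's locale
            ;;
        *)
            LC_ALL=C sort "$@"   # compare by native byte values
            ;;
    esac
}

printf 'X\nab\na-c\n' | sort_bytes_default
# prints:
# X
# a-c
# ab
```

With --use-locale the result depends on the locale installed and selected on the system, so no output is shown for that case.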
I think that this is something that should be specified explicitly.

Many Thanks,

Corin

/------------------------+-------------------------------------\
| Corin Hartland-Swann   | Tel: +44 (0) 20 7491 2000           |
| Commerce Internet Ltd  | Fax: +44 (0) 20 7491 2010           |
| 22 Cavendish Buildings | Mobile: +44 (0) 79 5854 0027        |
| Gilbert Street         |                                     |
| Mayfair                | Web: http://www.commerce.uk.net/    |
| London W1K 5HJ         | E-Mail: [EMAIL PROTECTED]           |
\------------------------+-------------------------------------/

_______________________________________________
Bug-fileutils mailing list
[EMAIL PROTECTED]
http://mail.gnu.org/mailman/listinfo/bug-fileutils