Re: problem with command sort after uniq -c
You're right, my locale is set to fr_FR. I've tried with en_EN and en_US,
and it works fine (and with -k1,1 too).

I think I understand the problem with the fr_FR locale: in French, to
write 123456.78 in an easily readable form you write 123 456,78 (in
English it's 123,456.78).

Thanks Philip and Andreas for your answers (and sorry for polluting the
bug mailing list).

Damien

Andreas Schwab wrote:
> Damien ANCELIN <[EMAIL PROTECTED]> writes:
>
>> I ran into a problem with the sort command: I used the uniq command
>> with the -c option to count some numbers, and applying sort -n
>> afterwards doesn't sort the lines in numeric order of the first field.
>> Here is an example (my sort version is 5.97):
>> $ cat bug_sort | sort -n
>
> This is a useless use of cat; you can just redirect sort's standard
> input from the file.
>
>> 1320 51970
>> 1692 12345
>> 22681 8060
>> 26063 8649
>> 2668 33603
>> 3487 44496
>> 4350 23246
>> 47013 8000
>> 5447 2
>> 81724 5000
>
> I assume that you use the fr_FR locale. In this locale a number can be
> grouped with a space, so the space is considered part of the number. If
> you want to be sure that sort considers only the first field as the
> sort key, you should use -k1,1 to limit it. The default is to always
> use the whole line as the sort key, and sort -n will take as much as
> possible from the key to match a number.
>
> Andreas.

--
Damien ANCELIN
INRIA - ENS-Lyon, LIP (RESO)
Bureau 322 Sud
Tel : +33 4 72 72 85 02
___
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils
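The locale effect described in this message can be checked with a minimal
sketch like the one below. It assumes a POSIX shell; the sample data and
the variable names (tmp, c_sorted) are made up for illustration, and the
locale utility is only called if it happens to be installed. Forcing
LC_ALL=C for the single sort command removes any grouping character, so
sort -n should stop reading the number at the first space.

```shell
#!/bin/sh
# Hypothetical sample in the thread's "count value" format.
tmp=$(mktemp)
printf '%s\n' '22681 8060' '2668 33603' '5447 2' > "$tmp"

# Print the grouping character of the current locale, if the locale
# utility is available (it prints an empty line when there is none).
if command -v locale >/dev/null 2>&1; then
    locale thousands_sep
fi

# The C locale has no thousands separator, so with LC_ALL=C sort -n
# reads only the leading digits of each line.
c_sorted=$(LC_ALL=C sort -n < "$tmp")
printf '%s\n' "$c_sorted"
rm -f "$tmp"
```

Setting LC_ALL on the command itself, rather than exporting it, keeps the
rest of the session in the user's own locale.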
Re: problem with command sort after uniq -c
Bauke Jan Douma wrote:
> What might have been the case here, and which is a
> situation that I find myself in sometimes, is this:
> you want to do 'filter1 FILE | filter2'
> (or 'filter1 [...] isn't what's to be expected. You investigate, and
> part of that is temporarily replacing filter1 with
> plain cat, and the command becomes 'cat FILE | filter2'.
>
> Most of the time this is on the command-line.

On your own command line is fine. It is your command line. No one else
would ever see it.

The objections come in when people write these into scripts and into
test cases and share them with other people. Many people believe that
cat into a pipe is the only way to do it. I have seen hundreds of lines
written this way in a single script! It is a misunderstanding.

Educating users to improve their programming abilities is just one of
the many burdens that must be endured; the alternative is the endless
burden of even more programs written poorly because of these
misconceptions.

Bob
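For readers who have only seen the cat form, here is a small sketch of
the equivalent spellings. The file contents and the variable names are
invented for the example, and LC_ALL=C is added only to keep the result
independent of the local environment.

```shell
#!/bin/sh
# Hypothetical input file with three numbers, one per line.
tmp=$(mktemp)
printf '%s\n' 30 4 200 > "$tmp"

# The form criticised in the thread: an extra cat process feeds the pipe.
via_cat=$(cat "$tmp" | LC_ALL=C sort -n)

# Redirecting standard input from the file gives sort the same data.
via_redirect=$(LC_ALL=C sort -n < "$tmp")

# sort also accepts the file name as an operand.
via_operand=$(LC_ALL=C sort -n "$tmp")

printf '%s\n' "$via_redirect"
rm -f "$tmp"
```

All three variables should hold the same sorted lines; the last two
simply avoid starting a process that only copies bytes.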
Re: problem with command sort after uniq -c
Andreas Schwab wrote on 10-03-08 19:54:
> Damien ANCELIN <[EMAIL PROTECTED]> writes:
>
>> I ran into a problem with the sort command: I used the uniq command
>> with the -c option to count some numbers, and applying sort -n
>> afterwards doesn't sort the lines in numeric order of the first field.
>> Here is an example (my sort version is 5.97):
>> $ cat bug_sort | sort -n
>
> This is a useless use of cat; you can just redirect sort's standard
> input from the file.

True, but such constructs do happen.

What might have been the case here, and which is a
situation that I find myself in sometimes, is this:
you want to do 'filter1 FILE | filter2'
(or 'filter1 [...]
Re: problem with command sort after uniq -c
Damien ANCELIN <[EMAIL PROTECTED]> writes:

> I ran into a problem with the sort command: I used the uniq command with
> the -c option to count some numbers, and applying sort -n afterwards
> doesn't sort the lines in numeric order of the first field.
> Here is an example (my sort version is 5.97):
> $ cat bug_sort | sort -n

This is a useless use of cat; you can just redirect sort's standard
input from the file.

> 1320 51970
> 1692 12345
> 22681 8060
> 26063 8649
> 2668 33603
> 3487 44496
> 4350 23246
> 47013 8000
> 5447 2
> 81724 5000

I assume that you use the fr_FR locale. In this locale a number can be
grouped with a space, so the space is considered part of the number. If
you want to be sure that sort considers only the first field as the sort
key, you should use -k1,1 to limit it. The default is to always use the
whole line as the sort key, and sort -n will take as much as possible
from the key to match a number.

Andreas.

--
Andreas Schwab, SuSE Labs, [EMAIL PROTECTED]
SuSE Linux Products GmbH, Maxfeldstraße 5, 90409 Nürnberg, Germany
PGP key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5
"And now for something completely different."
Re: problem with command sort after uniq -c
On Mon, 10 Mar 2008, Damien ANCELIN wrote:

> I ran into a problem with the sort command: I used the uniq command with
> the -c option to count some numbers, and applying sort -n afterwards
> doesn't sort the lines in numeric order of the first field.
> Here is an example (my sort version is 5.97):
> $ cat bug_sort | sort -n
> 1320 51970
> 1692 12345
> 22681 8060
> 26063 8649
> 2668 33603
> 3487 44496
> 4350 23246
> 47013 8000
> 5447 2
> 81724 5000

You don't say which locale your environment is configured to use for
sorting, but I'd bet it's one which treats whitespace differently from
how you expect.

> With only spaces between the 2 fields, sort -n reads one number per
> line and uses it to do the sort: 2668 33603 is read as 266833603.
> With this in mind, the result of sort is correct, but it's not
> what I expected (and I didn't see this behaviour in the documentation).

The command "sort -n" treats the whole line as the sort key. Specifying
"sort -k1,1n" will use just the first field, in ascending numerical
order.

Cheers,
Phil
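The key restriction suggested above can be tried with a short sketch such
as this one. The four sample lines are taken from the data in the thread;
the variable names are invented, and LC_ALL=C is an extra precaution
assumed here for reproducibility, not something the advice itself
requires.

```shell
#!/bin/sh
# A few lines from the thread's data, deliberately out of order.
tmp=$(mktemp)
printf '%s\n' '22681 8060' '1320 51970' '5447 2' '2668 33603' > "$tmp"

# -k1,1n: the sort key starts and ends at field 1 and is compared
# numerically, so the second column cannot become part of the number.
sorted=$(LC_ALL=C sort -k1,1n < "$tmp")
printf '%s\n' "$sorted"
rm -f "$tmp"
```

With the key limited to the first field, the lines should come out
ordered by 1320, 2668, 5447, 22681, whatever the second column contains.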
problem with command sort after uniq -c
Hello,

I ran into a problem with the sort command: I used the uniq command with
the -c option to count some numbers, and applying sort -n afterwards
doesn't sort the lines in numeric order of the first field.

Here is an example (my sort version is 5.97):

$ cat bug_sort | sort -n
1320 51970
1692 12345
22681 8060
26063 8649
2668 33603
3487 44496
4350 23246
47013 8000
5447 2
81724 5000

If I add a non-numeric, non-space character between the 2 fields,
sort -n works properly:

$ cat bug_sort | sed "s/\([0-9]\) \([0-9]\)/\1 -\2/" | sort -n
1320 -51970
1692 -12345
2668 -33603
3487 -44496
4350 -23246
5447 -2
22681 -8060
26063 -8649
47013 -8000
81724 -5000

With only spaces between the 2 fields, sort -n reads one number per line
and uses it to do the sort: 2668 33603 is read as 266833603.
With this in mind, the result of sort is correct, but it's not what I
expected (and I didn't see this behaviour in the documentation).

Regards,
Damien

--
Damien ANCELIN
INRIA - ENS-Lyon, LIP (RESO)
Bureau 322 Sud
Tel : +33 4 72 72 85 02