Re: numeric sort(1) is broken on -STABLE
On Thu, Feb 11, 2010 at 08:40:51AM +0100, Ulrich Spörlein wrote:
> On Wed, 10.02.2010 at 15:00:07 -0600, Dan Nelson wrote:
> > In the last episode (Feb 10), Ulrich Spörlein said:
> > > On Wed, 10.02.2010 at 13:49:05 +0300, Ruslan Ermilov wrote:
> > > > On Wed, Feb 10, 2010 at 09:58:14AM +0100, Ulrich Spörlein wrote:
> > > > > not sure if this is a pilot error, but it seems to me that gnu sort -n is broken on at least -STABLE (couldn't test -CURRENT yet). It somehow does not manifest when using a simple list and sorting on a specific column, but it always happens to me when using it in combination with find(1).
> > > > >
> > > > > % truncate -s10m a; truncate -s5m b; truncate -s800k c
> > > > > % find a b c -ls|sort -nk7,7
> > > > > 8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a
> > > > > 10 64 -rw-r--r-- 1 uqs wheel 5242880 Feb 10 09:13 b
> > > > > 12 64 -rw-r--r-- 1 uqs wheel 819200 Feb 10 09:13 c
> > > >
> > > > I bet you're using some non-C locale for LC_NUMERIC. What does locale output tell you?
> > >
> > > Yes and no. LC_NUMERIC is still at C, LC_CTYPE is set to UTF-8, but as there are no non-ASCII symbols in that output it shouldn't matter, right? For me, 819200 is smaller than 10485760 in pretty much all locales.
> > >
> > > Why the hell is a numeric gnusort locale dependent? Why is -g working anyway?
> >
> > Try adding a 'b' to your sort flags. I bet the leading spaces in front of your numbers are being treated as part of the sort key. Maybe de_DE.UTF-8 and C have different ideas of what is whitespace?
>
> Indeed, 'b' is working too. So I've stocked up on the number of workarounds for this problem. What amazes me, is that no one seems to be as shocked as I to find out something basic like sorting on a number is not DTRT.

It is a long-standing issue with Russian locales as well, but there the problem manifests itself only with LC_NUMERIC, not LC_CTYPE.

Cheers,
--
Ruslan Ermilov
r...@freebsd.org
FreeBSD committer
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: numeric sort(1) is broken on -STABLE
On 2010/02/10 17:58, Ulrich Spörlein wrote:
> Hi guys,
>
> not sure if this is a pilot error, but it seems to me that gnu sort -n is broken on at least -STABLE (couldn't test -CURRENT yet). It somehow does not manifest when using a simple list and sorting on a specific column, but it always happens to me when using it in combination with find(1).
>
> % truncate -s10m a; truncate -s5m b; truncate -s800k c
> % find a b c -ls|sort -nk7,7
> 8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a
> 10 64 -rw-r--r-- 1 uqs wheel 5242880 Feb 10 09:13 b
> 12 64 -rw-r--r-- 1 uqs wheel 819200 Feb 10 09:13 c
> % find a b c -ls|sort -gk7,7
> 12 64 -rw-r--r-- 1 uqs wheel 819200 Feb 10 09:13 c
> 10 64 -rw-r--r-- 1 uqs wheel 5242880 Feb 10 09:13 b
> 8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a

Hi,

here is a patch I've submitted about 4 years ago...

http://www.freebsd.org/cgi/query-pr.cgi?pr=gnu/93566

--
Kazuaki ODA
numeric sort(1) is broken on -STABLE
Hi guys,

not sure if this is a pilot error, but it seems to me that gnu sort -n is broken on at least -STABLE (couldn't test -CURRENT yet). It somehow does not manifest when using a simple list and sorting on a specific column, but it always happens to me when using it in combination with find(1).

% truncate -s10m a; truncate -s5m b; truncate -s800k c
% find a b c -ls|sort -nk7,7
8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a
10 64 -rw-r--r-- 1 uqs wheel 5242880 Feb 10 09:13 b
12 64 -rw-r--r-- 1 uqs wheel 819200 Feb 10 09:13 c
% find a b c -ls|sort -gk7,7
12 64 -rw-r--r-- 1 uqs wheel 819200 Feb 10 09:13 c
10 64 -rw-r--r-- 1 uqs wheel 5242880 Feb 10 09:13 b
8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a

At least -g does what is expected and I can work around this for the time being. Here's bsdsort:

% find a b c -ls|bsdsort -nk7,7
12 64 -rw-r--r-- 1 uqs wheel 819200 Feb 10 09:13 c
10 64 -rw-r--r-- 1 uqs wheel 5242880 Feb 10 09:13 b
8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a

and this is on Solaris 8:

% find a b c -ls|sort -nk7,7
546728 16 -rw-r--r-- 1 spoerul xxx 819200 Feb 10 09:49 c
546727 16 -rw-r--r-- 1 spoerul xxx 5242880 Feb 10 09:48 b
546724 16 -rw-r--r-- 1 spoerul xxx 10485760 Feb 10 09:48 a

It even occurred to me that we don't have a sort regression suite under tools/regression. Anyone know a place to find one with a suitable license?

Regards,
Uli
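As a rough illustration of what one entry in such a regression suite could look like, here is a minimal sketch. It is not taken from any existing test suite: the sample lines are only modelled on the find -ls output above, and the locale names in the loop are assumptions. A locale that is not installed falls back to C behaviour, in which case that iteration passes trivially.

```shell
# Sample input modelled on the find -ls output in this thread (hypothetical
# data, not captured from a real run); the file size is field 7.
sample_input() {
  printf '%s\n' \
    ' 8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a' \
    '10 64 -rw-r--r-- 1 uqs wheel  5242880 Feb 10 09:13 b' \
    '12 64 -rw-r--r-- 1 uqs wheel   819200 Feb 10 09:13 c'
}

# A numeric sort on the size column must give the same order in every locale.
expected='c b a'
failures=0

# The locale list is an assumption; adjust it to what is installed locally.
for loc in C de_DE.UTF-8 ru_RU.UTF-8; do
  # env scopes the locale to the sort process; awk collects the file names.
  got=$(sample_input \
    | env LC_ALL= LC_CTYPE="$loc" sort -nk7,7 \
    | awk '{ printf "%s%s", sep, $NF; sep = " " }')
  if [ "$got" = "$expected" ]; then
    echo "ok   LC_CTYPE=$loc"
  else
    echo "FAIL LC_CTYPE=$loc: got '$got', expected '$expected'"
    failures=$((failures + 1))
  fi
done

echo "failures: $failures"
```

On a sort(1) that shows the behaviour reported here, the non-C iterations would be expected to print FAIL, provided those locales are actually installed.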
Re: numeric sort(1) is broken on -STABLE
On Wed, Feb 10, 2010 at 09:58:14AM +0100, Ulrich Spörlein wrote:
> Hi guys,
>
> not sure if this is a pilot error, but it seems to me that gnu sort -n is broken on at least -STABLE (couldn't test -CURRENT yet). It somehow does not manifest when using a simple list and sorting on a specific column, but it always happens to me when using it in combination with find(1).
>
> % truncate -s10m a; truncate -s5m b; truncate -s800k c
> % find a b c -ls|sort -nk7,7
> 8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a
> 10 64 -rw-r--r-- 1 uqs wheel 5242880 Feb 10 09:13 b
> 12 64 -rw-r--r-- 1 uqs wheel 819200 Feb 10 09:13 c
> % find a b c -ls|sort -gk7,7
> 12 64 -rw-r--r-- 1 uqs wheel 819200 Feb 10 09:13 c
> 10 64 -rw-r--r-- 1 uqs wheel 5242880 Feb 10 09:13 b
> 8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a

I can't repro this on 8.0-STABLE or 7.2-STABLE. /usr/bin/sort on these machines is what comes with the base system (which is GNU coreutils sort). Maybe your issue is related to locale(1) variables?

$ uname -a
FreeBSD icarus.home.lan 8.0-STABLE FreeBSD 8.0-STABLE #0: Sat Jan 16 17:48:04 PST 2010 r...@icarus.home.lan:/usr/obj/usr/src/sys/X7SBA_RELENG_8_amd64 amd64
$ truncate -s10m a; truncate -s5m b; truncate -s800k c
$ find a b c -ls | /usr/bin/sort -nk7,7
30781 -rw---1 jdc users 819200 10 Feb 01:11 c
30771 -rw---1 jdc users 5242880 10 Feb 01:11 b
30761 -rw---1 jdc users 10485760 10 Feb 01:11 a
$ find a b c -ls | /usr/bin/sort -gk7,7
30781 -rw---1 jdc users 819200 10 Feb 01:11 c
30771 -rw---1 jdc users 5242880 10 Feb 01:11 b
30761 -rw---1 jdc users 10485760 10 Feb 01:11 a
$ /usr/bin/sort --version
sort (GNU coreutils) 5.3.0-20040812-FreeBSD
Written by Mike Haertel and Paul Eggert.
Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ locale
LANG=en_GB.ISO8859-1
LC_CTYPE=en_GB.ISO8859-1
LC_COLLATE=C
LC_TIME=en_GB.ISO8859-1
LC_NUMERIC=en_GB.ISO8859-1
LC_MONETARY=en_GB.ISO8859-1
LC_MESSAGES=en_GB.ISO8859-1
LC_ALL=

$ uname -a
FreeBSD horus.parodius.com 7.2-STABLE FreeBSD 7.2-STABLE #0: Sat Jan 9 07:52:27 PST 2010 r...@horus.sc1.parodius.com:/usr/obj/usr/src/sys/PDSMI_PLUS_RELENG_7_amd64 amd64
$ truncate -s10m a; truncate -s5m b; truncate -s800k c
$ find a b c -ls | /usr/bin/sort -nk7,7
4061321 -rw---1 jdc users 819200 10 Feb 01:13 c
4061311 -rw---1 jdc users 5242880 10 Feb 01:13 b
4061301 -rw---1 jdc users 10485760 10 Feb 01:13 a
$ find a b c -ls | /usr/bin/sort -gk7,7
4061321 -rw---1 jdc users 819200 10 Feb 01:13 c
4061311 -rw---1 jdc users 5242880 10 Feb 01:13 b
4061301 -rw---1 jdc users 10485760 10 Feb 01:13 a
$ /usr/bin/sort --version
sort (GNU coreutils) 5.3.0-20040812-FreeBSD
Written by Mike Haertel and Paul Eggert.
Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ locale
LANG=en_GB.ISO8859-1
LC_CTYPE=en_GB.ISO8859-1
LC_COLLATE=C
LC_TIME=en_GB.ISO8859-1
LC_NUMERIC=en_GB.ISO8859-1
LC_MONETARY=en_GB.ISO8859-1
LC_MESSAGES=en_GB.ISO8859-1
LC_ALL=

--
| Jeremy Chadwick j...@parodius.com |
| Parodius Networking http://www.parodius.com/ |
| UNIX Systems Administrator Mountain View, CA, USA |
| Making life hard for others since 1977. PGP: 4BD6C0CB |
Re: numeric sort(1) is broken on -STABLE
On Wed, Feb 10, 2010 at 09:58:14AM +0100, Ulrich Spörlein wrote:
> Hi guys,
>
> not sure if this is a pilot error, but it seems to me that gnu sort -n is broken on at least -STABLE (couldn't test -CURRENT yet). It somehow does not manifest when using a simple list and sorting on a specific column, but it always happens to me when using it in combination with find(1).
>
> % truncate -s10m a; truncate -s5m b; truncate -s800k c
> % find a b c -ls|sort -nk7,7
> 8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a
> 10 64 -rw-r--r-- 1 uqs wheel 5242880 Feb 10 09:13 b
> 12 64 -rw-r--r-- 1 uqs wheel 819200 Feb 10 09:13 c

I bet you're using some non-C locale for LC_NUMERIC. What does locale output tell you?

> % find a b c -ls|sort -gk7,7
> 12 64 -rw-r--r-- 1 uqs wheel 819200 Feb 10 09:13 c
> 10 64 -rw-r--r-- 1 uqs wheel 5242880 Feb 10 09:13 b
> 8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a
>
> At least -g does what is expected and I can work around this for the time being. Here's bsdsort:
>
> % find a b c -ls|bsdsort -nk7,7
> 12 64 -rw-r--r-- 1 uqs wheel 819200 Feb 10 09:13 c
> 10 64 -rw-r--r-- 1 uqs wheel 5242880 Feb 10 09:13 b
> 8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a
>
> and this is on Solaris 8:
>
> % find a b c -ls|sort -nk7,7
> 546728 16 -rw-r--r-- 1 spoerul xxx 819200 Feb 10 09:49 c
> 546727 16 -rw-r--r-- 1 spoerul xxx 5242880 Feb 10 09:48 b
> 546724 16 -rw-r--r-- 1 spoerul xxx 10485760 Feb 10 09:48 a
>
> It even occurred to me that we don't have a sort regression suite under tools/regression. Anyone know a place to find one with a suitable license?
>
> Regards,
> Uli

--
Ruslan Ermilov
r...@freebsd.org
FreeBSD committer
Re: numeric sort(1) is broken on -STABLE
On Wed, 10.02.2010 at 13:49:05 +0300, Ruslan Ermilov wrote:
> On Wed, Feb 10, 2010 at 09:58:14AM +0100, Ulrich Spörlein wrote:
> > Hi guys,
> >
> > not sure if this is a pilot error, but it seems to me that gnu sort -n is broken on at least -STABLE (couldn't test -CURRENT yet). It somehow does not manifest when using a simple list and sorting on a specific column, but it always happens to me when using it in combination with find(1).
> >
> > % truncate -s10m a; truncate -s5m b; truncate -s800k c
> > % find a b c -ls|sort -nk7,7
> > 8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a
> > 10 64 -rw-r--r-- 1 uqs wheel 5242880 Feb 10 09:13 b
> > 12 64 -rw-r--r-- 1 uqs wheel 819200 Feb 10 09:13 c
>
> I bet you're using some non-C locale for LC_NUMERIC. What does locale output tell you?

Yes and no. LC_NUMERIC is still at C, LC_CTYPE is set to UTF-8, but as there are no non-ASCII symbols in that output it shouldn't matter, right? For me, 819200 is smaller than 10485760 in pretty much all locales.

Why the hell is a numeric gnusort locale dependent? Why is -g working anyway?

% locale
LANG=
LC_CTYPE=de_DE.UTF-8
LC_COLLATE=C
LC_TIME=C
LC_NUMERIC=C
LC_MONETARY=C
LC_MESSAGES=C
LC_ALL=
% find a b c -ls | LC_ALL=C sort -nk7,7
12 64 -rw-r--r-- 1 uqs wheel 819200 Feb 10 09:13 c
10 64 -rw-r--r-- 1 uqs wheel 5242880 Feb 10 09:13 b
8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a

Great, now I'm even more angry at sort(1) than before ...

Regards,
Uli
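The LC_ALL=C workaround shown above can be scoped to the sort process alone, so the rest of the session keeps its de_DE.UTF-8 settings. A minimal sketch follows, using sample lines that are only modelled on the find -ls output in this thread (the data and the sample_input helper are assumptions, not captured output):

```shell
# Hypothetical sample data shaped like the find -ls output; size is field 7.
sample_input() {
  printf '%s\n' \
    ' 8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a' \
    '10 64 -rw-r--r-- 1 uqs wheel  5242880 Feb 10 09:13 b' \
    '12 64 -rw-r--r-- 1 uqs wheel   819200 Feb 10 09:13 c'
}

# The variable assignment before the command applies to that one sort
# invocation only; the calling shell's locale is left untouched.
sorted=$(sample_input | LC_ALL=C sort -nk7,7)

printf '%s\n' "$sorted"
```

Because LC_ALL takes precedence over every individual LC_* variable, this also overrides LC_CTYPE, which is the only non-C category in the locale output above.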
Re: numeric sort(1) is broken on -STABLE
In the last episode (Feb 10), Ulrich Spörlein said:
> On Wed, 10.02.2010 at 13:49:05 +0300, Ruslan Ermilov wrote:
> > On Wed, Feb 10, 2010 at 09:58:14AM +0100, Ulrich Spörlein wrote:
> > > not sure if this is a pilot error, but it seems to me that gnu sort -n is broken on at least -STABLE (couldn't test -CURRENT yet). It somehow does not manifest when using a simple list and sorting on a specific column, but it always happens to me when using it in combination with find(1).
> > >
> > > % truncate -s10m a; truncate -s5m b; truncate -s800k c
> > > % find a b c -ls|sort -nk7,7
> > > 8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a
> > > 10 64 -rw-r--r-- 1 uqs wheel 5242880 Feb 10 09:13 b
> > > 12 64 -rw-r--r-- 1 uqs wheel 819200 Feb 10 09:13 c
> >
> > I bet you're using some non-C locale for LC_NUMERIC. What does locale output tell you?
>
> Yes and no. LC_NUMERIC is still at C, LC_CTYPE is set to UTF-8, but as there are no non-ASCII symbols in that output it shouldn't matter, right? For me, 819200 is smaller than 10485760 in pretty much all locales.
>
> Why the hell is a numeric gnusort locale dependent? Why is -g working anyway?

Try adding a 'b' to your sort flags. I bet the leading spaces in front of your numbers are being treated as part of the sort key. Maybe de_DE.UTF-8 and C have different ideas of what is whitespace?

--
Dan Nelson
dnel...@allantgroup.com
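For reference, the suggestion amounts to adding -b, which makes sort ignore leading blanks when it determines where a key starts and ends. A minimal sketch, again with hypothetical sample lines modelled on the find -ls output (the padded size column and the sample_input helper are assumptions for illustration):

```shell
# Hypothetical sample data; the size column (field 7) is right-aligned, so
# the shorter numbers carry extra leading blanks inside their field.
sample_input() {
  printf '%s\n' \
    ' 8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a' \
    '10 64 -rw-r--r-- 1 uqs wheel  5242880 Feb 10 09:13 b' \
    '12 64 -rw-r--r-- 1 uqs wheel   819200 Feb 10 09:13 c'
}

# -b: ignore leading blanks when locating the sort key
# -n: compare the key numerically
# -k7,7: restrict the key to field 7 (the size)
sorted_b=$(sample_input | sort -bnk7,7)

printf '%s\n' "$sorted_b"
```

Whether blank handling is really what differs between de_DE.UTF-8 and C here is only a guess at this point in the thread; the sketch shows the flag, not the root cause.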
Re: numeric sort(1) is broken on -STABLE
On Wed, 10.02.2010 at 15:00:07 -0600, Dan Nelson wrote:
> In the last episode (Feb 10), Ulrich Spörlein said:
> > On Wed, 10.02.2010 at 13:49:05 +0300, Ruslan Ermilov wrote:
> > > On Wed, Feb 10, 2010 at 09:58:14AM +0100, Ulrich Spörlein wrote:
> > > > not sure if this is a pilot error, but it seems to me that gnu sort -n is broken on at least -STABLE (couldn't test -CURRENT yet). It somehow does not manifest when using a simple list and sorting on a specific column, but it always happens to me when using it in combination with find(1).
> > > >
> > > > % truncate -s10m a; truncate -s5m b; truncate -s800k c
> > > > % find a b c -ls|sort -nk7,7
> > > > 8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a
> > > > 10 64 -rw-r--r-- 1 uqs wheel 5242880 Feb 10 09:13 b
> > > > 12 64 -rw-r--r-- 1 uqs wheel 819200 Feb 10 09:13 c
> > >
> > > I bet you're using some non-C locale for LC_NUMERIC. What does locale output tell you?
> >
> > Yes and no. LC_NUMERIC is still at C, LC_CTYPE is set to UTF-8, but as there are no non-ASCII symbols in that output it shouldn't matter, right? For me, 819200 is smaller than 10485760 in pretty much all locales.
> >
> > Why the hell is a numeric gnusort locale dependent? Why is -g working anyway?
>
> Try adding a 'b' to your sort flags. I bet the leading spaces in front of your numbers are being treated as part of the sort key. Maybe de_DE.UTF-8 and C have different ideas of what is whitespace?

Indeed, 'b' is working too. So I've stocked up on the number of workarounds for this problem. What amazes me, is that no one seems to be as shocked as I to find out something basic like sorting on a number is not DTRT.

Bye,
Uli