Re: numeric sort(1) is broken on -STABLE

2010-02-11 Thread Ruslan Ermilov
On Thu, Feb 11, 2010 at 08:40:51AM +0100, Ulrich Spörlein wrote:
 On Wed, 10.02.2010 at 15:00:07 -0600, Dan Nelson wrote:
  In the last episode (Feb 10), Ulrich Spörlein said:
   On Wed, 10.02.2010 at 13:49:05 +0300, Ruslan Ermilov wrote:
On Wed, Feb 10, 2010 at 09:58:14AM +0100, Ulrich Spörlein wrote:
 not sure if this is a pilot error, but it seems to me that gnu sort -n
 is broken on at least -STABLE (couldn't test -CURRENT yet).
 
 It somehow does not manifest when using a simple list and sorting on a
 specific column, but it always happens to me when using it in
 combination with find(1).
 
 % truncate -s10m a; truncate -s5m b; truncate -s800k c
 % find a b c -ls|sort -nk7,7
  8   64 -rw-r--r--  1 uqs  wheel 10485760 Feb 10 09:13 a
 10   64 -rw-r--r--  1 uqs  wheel  5242880 Feb 10 09:13 b
 12   64 -rw-r--r--  1 uqs  wheel   819200 Feb 10 09:13 c

I bet you're using some non-C locale for LC_NUMERIC.  What does locale
output tell you?
   
   Yes and no. LC_NUMERIC is still at C, LC_CTYPE is set to UTF-8, but as
   there are no non-ASCII symbols in that output it shouldn't matter, right? 
   For me, 819200 is smaller than 10485760 in pretty much all locales.  Why
   the hell is a numeric gnusort locale dependent?  Why is -g working anyway?
  
  Try adding a 'b' to your sort flags.  I bet the leading spaces in front of
  your numbers are being treated as part of the sort key.  Maybe de_DE.UTF-8
  and C have different ideas of what is whitespace?
 
 Indeed, 'b' works too, so I have now stocked up on workarounds for this
 problem. What amazes me is that no one seems to be as shocked as I am to
 find out that something as basic as sorting on a number does not DTRT.

It is a long standing issue with Russian locales as well, but there
the problem manifests itself only with LC_NUMERIC, not LC_CTYPE.
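
One way to narrow down which locale category triggers the misordering is
to run the same pipeline with a single category overridden at a time. The
sketch below is an illustration, not something posted in the thread: the
sample records only imitate find -ls output, the probe function is a
made-up helper, and the locale names de_DE.UTF-8 and ru_RU.UTF-8 are
assumptions that may not be installed on a given machine (an unknown
locale name normally falls back to C).

```sh
#!/bin/sh
# Sample records shaped like `find -ls` output; the size is field 7.
records() {
  printf '%s\n' \
    ' 8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a' \
    '10 64 -rw-r--r-- 1 uqs wheel  5242880 Feb 10 09:13 b' \
    '12 64 -rw-r--r-- 1 uqs wheel   819200 Feb 10 09:13 c'
}

# Run the numeric sort with exactly one locale setting applied and
# print the resulting order of the file names on one line.
probe() {
  # LC_ALL is cleared first so that it cannot mask the category under test.
  order=$(records | env LC_ALL= "$1" sort -n -k7,7 | awk '{ printf "%s ", $NF }')
  printf '%-24s -> %s\n' "$1" "$order"
}

probe 'LC_ALL=C'
probe 'LC_CTYPE=de_DE.UTF-8'     # assumed locale name
probe 'LC_NUMERIC=ru_RU.UTF-8'   # assumed locale name
```

A line whose order is not "c b a" points at the category that changes the
behaviour of sort -n on that system.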


Cheers,
-- 
Ruslan Ermilov
r...@freebsd.org
FreeBSD committer
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: numeric sort(1) is broken on -STABLE

2010-02-11 Thread Kazuaki ODA

On 2010/02/10 17:58, Ulrich Spörlein wrote:

Hi guys,

not sure if this is a pilot error, but it seems to me that gnu sort -n
is broken on at least -STABLE (couldn't test -CURRENT yet).

It somehow does not manifest when using a simple list and sorting on a
specific column, but it always happens to me when using it in
combination with find(1).

% truncate -s10m a; truncate -s5m b; truncate -s800k c
% find a b c -ls|sort -nk7,7
  8   64 -rw-r--r--  1 uqs  wheel 10485760 Feb 10 09:13 a
 10   64 -rw-r--r--  1 uqs  wheel  5242880 Feb 10 09:13 b
 12   64 -rw-r--r--  1 uqs  wheel   819200 Feb 10 09:13 c
% find a b c -ls|sort -gk7,7
 12   64 -rw-r--r--  1 uqs  wheel   819200 Feb 10 09:13 c
 10   64 -rw-r--r--  1 uqs  wheel  5242880 Feb 10 09:13 b
  8   64 -rw-r--r--  1 uqs  wheel 10485760 Feb 10 09:13 a


Hi, here is a patch I submitted about 4 years ago:
http://www.freebsd.org/cgi/query-pr.cgi?pr=gnu/93566

--
Kazuaki ODA


numeric sort(1) is broken on -STABLE

2010-02-10 Thread Ulrich Spörlein
Hi guys,

not sure if this is a pilot error, but it seems to me that gnu sort -n
is broken on at least -STABLE (couldn't test -CURRENT yet).

It somehow does not manifest when using a simple list and sorting on a
specific column, but it always happens to me when using it in
combination with find(1).

% truncate -s10m a; truncate -s5m b; truncate -s800k c
% find a b c -ls|sort -nk7,7
 8   64 -rw-r--r--  1 uqs  wheel 10485760 Feb 10 09:13 a
10   64 -rw-r--r--  1 uqs  wheel  5242880 Feb 10 09:13 b
12   64 -rw-r--r--  1 uqs  wheel   819200 Feb 10 09:13 c
% find a b c -ls|sort -gk7,7
12   64 -rw-r--r--  1 uqs  wheel   819200 Feb 10 09:13 c
10   64 -rw-r--r--  1 uqs  wheel  5242880 Feb 10 09:13 b
 8   64 -rw-r--r--  1 uqs  wheel 10485760 Feb 10 09:13 a

at least -g does what is expected and I can work around this for the time 
being. Here's bsdsort

% find a b c -ls|bsdsort -nk7,7
12   64 -rw-r--r--  1 uqs  wheel   819200 Feb 10 09:13 c
10   64 -rw-r--r--  1 uqs  wheel  5242880 Feb 10 09:13 b
 8   64 -rw-r--r--  1 uqs  wheel 10485760 Feb 10 09:13 a

and this is on Solaris 8

% find a b c -ls|sort -nk7,7
546728   16 -rw-r--r--   1 spoerul xxx819200 Feb 10 09:49 c
546727   16 -rw-r--r--   1 spoerul xxx   5242880 Feb 10 09:48 b
546724   16 -rw-r--r--   1 spoerul xxx  10485760 Feb 10 09:48 a

It even occurred to me that we don't have a sort regression suite under
tools/regression. Anyone know a place to find one with a suitable license?
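
As a rough idea of what such a regression check could look like, here is
a small self-contained sketch in sh. It is not an existing suite: the
SORT and TEST_LOCALE variables and the check function are hypothetical
names introduced for this example, and the three cases only cover the
padded-column behaviour discussed in this thread.

```sh
#!/bin/sh
# Sketch of a tiny sort(1) regression harness. SORT and TEST_LOCALE are
# hypothetical knobs introduced for this example, not existing conventions.
SORT=${SORT:-sort}
TEST_LOCALE=${TEST_LOCALE:-C}
failures=0

# check NAME INPUT EXPECTED [sort options...]
# INPUT and EXPECTED use \n escapes, which printf %b expands.
check() {
  name=$1
  input=$(printf '%b' "$2")
  expected=$(printf '%b' "$3")
  shift 3
  actual=$(printf '%s\n' "$input" | LC_ALL="$TEST_LOCALE" "$SORT" "$@")
  if [ "$actual" = "$expected" ]; then
    echo "ok - $name"
  else
    echo "not ok - $name"
    failures=$((failures + 1))
  fi
}

check 'plain numeric' '10\n9\n100' '9\n10\n100' -n
check 'numeric key, padded column' \
  'a 10485760\nb  5242880\nc   819200' \
  'c   819200\nb  5242880\na 10485760' -n -k2,2
check 'numeric key, padded column, -b' \
  'a 10485760\nb  5242880\nc   819200' \
  'c   819200\nb  5242880\na 10485760' -b -n -k2,2

echo "$failures failure(s)"
```

Running it once with TEST_LOCALE=C and once with the locale that shows
the problem would make a locale-dependent difference visible as a
"not ok" line.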

Regards,
Uli


Re: numeric sort(1) is broken on -STABLE

2010-02-10 Thread Jeremy Chadwick
On Wed, Feb 10, 2010 at 09:58:14AM +0100, Ulrich Spörlein wrote:
 Hi guys,
 
 not sure if this is a pilot error, but it seems to me that gnu sort -n
 is broken on at least -STABLE (couldn't test -CURRENT yet).
 
 It somehow does not manifest when using a simple list and sorting on a
 specific column, but it always happens to me when using it in
 combination with find(1).
 
 % truncate -s10m a; truncate -s5m b; truncate -s800k c
 % find a b c -ls|sort -nk7,7
  8   64 -rw-r--r--  1 uqs  wheel 10485760 Feb 10 09:13 a
 10   64 -rw-r--r--  1 uqs  wheel  5242880 Feb 10 09:13 b
 12   64 -rw-r--r--  1 uqs  wheel   819200 Feb 10 09:13 c
 % find a b c -ls|sort -gk7,7
 12   64 -rw-r--r--  1 uqs  wheel   819200 Feb 10 09:13 c
 10   64 -rw-r--r--  1 uqs  wheel  5242880 Feb 10 09:13 b
  8   64 -rw-r--r--  1 uqs  wheel 10485760 Feb 10 09:13 a

I can't repro this on 8.0-STABLE or 7.2-STABLE.  /usr/bin/sort on these
machines is what comes with the base system (which is GNU coreutils
sort).

Maybe your issue is related to locale(1) variables?

$ uname -a
FreeBSD icarus.home.lan 8.0-STABLE FreeBSD 8.0-STABLE #0: Sat Jan 16 17:48:04 PST 2010 r...@icarus.home.lan:/usr/obj/usr/src/sys/X7SBA_RELENG_8_amd64  amd64
$ truncate -s10m a; truncate -s5m b; truncate -s800k c
$ find a b c -ls | /usr/bin/sort -nk7,7
  30781 -rw---1 jdc  users   819200 10 Feb 01:11 c
  30771 -rw---1 jdc  users  5242880 10 Feb 01:11 b
  30761 -rw---1 jdc  users 10485760 10 Feb 01:11 a
$ find a b c -ls | /usr/bin/sort -gk7,7
  30781 -rw---1 jdc  users   819200 10 Feb 01:11 c
  30771 -rw---1 jdc  users  5242880 10 Feb 01:11 b
  30761 -rw---1 jdc  users 10485760 10 Feb 01:11 a
$ /usr/bin/sort --version
sort (GNU coreutils) 5.3.0-20040812-FreeBSD
Written by Mike Haertel and Paul Eggert.

Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ locale
LANG=en_GB.ISO8859-1
LC_CTYPE=en_GB.ISO8859-1
LC_COLLATE=C
LC_TIME=en_GB.ISO8859-1
LC_NUMERIC=en_GB.ISO8859-1
LC_MONETARY=en_GB.ISO8859-1
LC_MESSAGES=en_GB.ISO8859-1
LC_ALL=


$ uname -a
FreeBSD horus.parodius.com 7.2-STABLE FreeBSD 7.2-STABLE #0: Sat Jan  9 07:52:27 PST 2010 r...@horus.sc1.parodius.com:/usr/obj/usr/src/sys/PDSMI_PLUS_RELENG_7_amd64  amd64
$ truncate -s10m a; truncate -s5m b; truncate -s800k c
$ find a b c -ls | /usr/bin/sort -nk7,7
4061321 -rw---1 jdc  users   819200 10 Feb 01:13 c
4061311 -rw---1 jdc  users  5242880 10 Feb 01:13 b
4061301 -rw---1 jdc  users 10485760 10 Feb 01:13 a
$ find a b c -ls | /usr/bin/sort -gk7,7
4061321 -rw---1 jdc  users   819200 10 Feb 01:13 c
4061311 -rw---1 jdc  users  5242880 10 Feb 01:13 b
4061301 -rw---1 jdc  users 10485760 10 Feb 01:13 a
$ /usr/bin/sort --version
sort (GNU coreutils) 5.3.0-20040812-FreeBSD
Written by Mike Haertel and Paul Eggert.

Copyright (C) 2004 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ locale
LANG=en_GB.ISO8859-1
LC_CTYPE=en_GB.ISO8859-1
LC_COLLATE=C
LC_TIME=en_GB.ISO8859-1
LC_NUMERIC=en_GB.ISO8859-1
LC_MONETARY=en_GB.ISO8859-1
LC_MESSAGES=en_GB.ISO8859-1
LC_ALL=

-- 
| Jeremy Chadwick   j...@parodius.com |
| Parodius Networking   http://www.parodius.com/ |
| UNIX Systems Administrator  Mountain View, CA, USA |
| Making life hard for others since 1977.  PGP: 4BD6C0CB |



Re: numeric sort(1) is broken on -STABLE

2010-02-10 Thread Ruslan Ermilov
On Wed, Feb 10, 2010 at 09:58:14AM +0100, Ulrich Spörlein wrote:
 Hi guys,
 
 not sure if this is a pilot error, but it seems to me that gnu sort -n
 is broken on at least -STABLE (couldn't test -CURRENT yet).
 
 It somehow does not manifest when using a simple list and sorting on a
 specific column, but it always happens to me when using it in
 combination with find(1).
 
 % truncate -s10m a; truncate -s5m b; truncate -s800k c
 % find a b c -ls|sort -nk7,7
  8   64 -rw-r--r--  1 uqs  wheel 10485760 Feb 10 09:13 a
 10   64 -rw-r--r--  1 uqs  wheel  5242880 Feb 10 09:13 b
 12   64 -rw-r--r--  1 uqs  wheel   819200 Feb 10 09:13 c

I bet you're using some non-C locale for LC_NUMERIC.
What does locale output tell you?

 % find a b c -ls|sort -gk7,7
 12   64 -rw-r--r--  1 uqs  wheel   819200 Feb 10 09:13 c
 10   64 -rw-r--r--  1 uqs  wheel  5242880 Feb 10 09:13 b
  8   64 -rw-r--r--  1 uqs  wheel 10485760 Feb 10 09:13 a
 
 at least -g does what is expected and I can work around this for the time 
 being. Here's bsdsort
 
 % find a b c -ls|bsdsort -nk7,7
 12   64 -rw-r--r--  1 uqs  wheel   819200 Feb 10 09:13 c
 10   64 -rw-r--r--  1 uqs  wheel  5242880 Feb 10 09:13 b
  8   64 -rw-r--r--  1 uqs  wheel 10485760 Feb 10 09:13 a
 
 and this is on Solaris 8
 
 % find a b c -ls|sort -nk7,7
 546728   16 -rw-r--r--   1 spoerul xxx819200 Feb 10 09:49 c
 546727   16 -rw-r--r--   1 spoerul xxx   5242880 Feb 10 09:48 b
 546724   16 -rw-r--r--   1 spoerul xxx  10485760 Feb 10 09:48 a
 
 It even occurred to me that we don't have a sort regression suite under
 tools/regression. Anyone know a place to find one with a suitable license?
 
 Regards,
 Uli
 

-- 
Ruslan Ermilov
r...@freebsd.org
FreeBSD committer


Re: numeric sort(1) is broken on -STABLE

2010-02-10 Thread Ulrich Spörlein
On Wed, 10.02.2010 at 13:49:05 +0300, Ruslan Ermilov wrote:
 On Wed, Feb 10, 2010 at 09:58:14AM +0100, Ulrich Spörlein wrote:
  Hi guys,
  
  not sure if this is a pilot error, but it seems to me that gnu sort -n
  is broken on at least -STABLE (couldn't test -CURRENT yet).
  
  It somehow does not manifest when using a simple list and sorting on a
  specific column, but it always happens to me when using it in
  combination with find(1).
  
  % truncate -s10m a; truncate -s5m b; truncate -s800k c
  % find a b c -ls|sort -nk7,7
   8   64 -rw-r--r--  1 uqs  wheel 10485760 Feb 10 09:13 a
  10   64 -rw-r--r--  1 uqs  wheel  5242880 Feb 10 09:13 b
  12   64 -rw-r--r--  1 uqs  wheel   819200 Feb 10 09:13 c
 
 I bet you're using some non-C locale for LC_NUMERIC.
 What does locale output tell you?

Yes and no. LC_NUMERIC is still at C, LC_CTYPE is set to UTF-8, but as
there are no non-ASCII symbols in that output it shouldn't matter,
right? For me, 819200 is smaller than 10485760 in pretty much all
locales. Why the hell is a numeric gnusort locale dependent? Why is -g
working anyway?

% locale
LANG=
LC_CTYPE=de_DE.UTF-8
LC_COLLATE=C
LC_TIME=C
LC_NUMERIC=C
LC_MONETARY=C
LC_MESSAGES=C
LC_ALL=
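
The listing above can be read with the usual POSIX precedence in mind: a
non-empty LC_ALL wins, then the category's own variable, then LANG, and
otherwise the C locale applies. A minimal sketch of that lookup follows;
the function name effective_locale is made up for this example, and the
values assigned in the subshell are simply the ones from the listing.

```sh
#!/bin/sh
# Print the locale name that applies to one category (e.g. LC_CTYPE),
# following the precedence LC_ALL > LC_<category> > LANG > C.
effective_locale() {
  eval "category_value=\${$1:-}"
  if [ -n "${LC_ALL:-}" ]; then
    echo "$LC_ALL"
  elif [ -n "$category_value" ]; then
    echo "$category_value"
  elif [ -n "${LANG:-}" ]; then
    echo "$LANG"
  else
    echo "C"
  fi
}

# The settings reported in the thread, applied in a subshell so the
# caller's environment is left alone.
(
  LC_ALL=; LANG=; LC_CTYPE=de_DE.UTF-8; LC_NUMERIC=C
  echo "LC_CTYPE   -> $(effective_locale LC_CTYPE)"
  echo "LC_NUMERIC -> $(effective_locale LC_NUMERIC)"
)
```

With these settings only LC_CTYPE resolves to a non-C locale, which
matches the observation that the misordering goes away under LC_ALL=C.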

% find a b c -ls | LC_ALL=C sort -nk7,7
12   64 -rw-r--r--  1 uqs  wheel   819200 Feb 10 09:13 c
10   64 -rw-r--r--  1 uqs  wheel  5242880 Feb 10 09:13 b
 8   64 -rw-r--r--  1 uqs  wheel 10485760 Feb 10 09:13 a

Great, now I'm even more angry at sort(1) than before ...

Regards,
Uli


Re: numeric sort(1) is broken on -STABLE

2010-02-10 Thread Dan Nelson
In the last episode (Feb 10), Ulrich Spörlein said:
 On Wed, 10.02.2010 at 13:49:05 +0300, Ruslan Ermilov wrote:
  On Wed, Feb 10, 2010 at 09:58:14AM +0100, Ulrich Spörlein wrote:
   not sure if this is a pilot error, but it seems to me that gnu sort -n
   is broken on at least -STABLE (couldn't test -CURRENT yet).
   
   It somehow does not manifest when using a simple list and sorting on a
   specific column, but it always happens to me when using it in
   combination with find(1).
   
   % truncate -s10m a; truncate -s5m b; truncate -s800k c
   % find a b c -ls|sort -nk7,7
    8   64 -rw-r--r--  1 uqs  wheel 10485760 Feb 10 09:13 a
   10   64 -rw-r--r--  1 uqs  wheel  5242880 Feb 10 09:13 b
   12   64 -rw-r--r--  1 uqs  wheel   819200 Feb 10 09:13 c
  
  I bet you're using some non-C locale for LC_NUMERIC.  What does locale
  output tell you?
 
 Yes and no. LC_NUMERIC is still at C, LC_CTYPE is set to UTF-8, but as
 there are no non-ASCII symbols in that output it shouldn't matter, right? 
 For me, 819200 is smaller than 10485760 in pretty much all locales.  Why
 the hell is a numeric gnusort locale dependent?  Why is -g working anyway?

Try adding a 'b' to your sort flags.  I bet the leading spaces in front of
your numbers are being treated as part of the sort key.  Maybe de_DE.UTF-8
and C have different ideas of what is whitespace?
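
The suggestion can be tried without touching the filesystem by feeding
sort a few hand-written records. This is only a sketch: the records
imitate find -ls output with a right-aligned size column, so field 7
starts with blanks, and -b asks sort to skip those blanks when locating
the key. Whether plain -n misorders the same input depends on the sort
implementation and locale in use.

```sh
#!/bin/sh
# Hand-written stand-ins for `find -ls` output; field 7 is the size.
input=' 8 64 -rw-r--r-- 1 uqs wheel 10485760 Feb 10 09:13 a
10 64 -rw-r--r-- 1 uqs wheel  5242880 Feb 10 09:13 b
12 64 -rw-r--r-- 1 uqs wheel   819200 Feb 10 09:13 c'

# -b given before -k applies to the key: leading blanks in field 7 are
# ignored, then -n compares the remaining digits numerically.
sorted=$(printf '%s\n' "$input" | sort -b -n -k7,7)
printf '%s\n' "$sorted"
```

The smallest size (file c) should come out first and the largest (file a)
last.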

-- 
Dan Nelson
dnel...@allantgroup.com


Re: numeric sort(1) is broken on -STABLE

2010-02-10 Thread Ulrich Spörlein
On Wed, 10.02.2010 at 15:00:07 -0600, Dan Nelson wrote:
 In the last episode (Feb 10), Ulrich Spörlein said:
  On Wed, 10.02.2010 at 13:49:05 +0300, Ruslan Ermilov wrote:
   On Wed, Feb 10, 2010 at 09:58:14AM +0100, Ulrich Spörlein wrote:
not sure if this is a pilot error, but it seems to me that gnu sort -n
is broken on at least -STABLE (couldn't test -CURRENT yet).

It somehow does not manifest when using a simple list and sorting on a
specific column, but it always happens to me when using it in
combination with find(1).

% truncate -s10m a; truncate -s5m b; truncate -s800k c
% find a b c -ls|sort -nk7,7
 8   64 -rw-r--r--  1 uqs  wheel 10485760 Feb 10 09:13 a
10   64 -rw-r--r--  1 uqs  wheel  5242880 Feb 10 09:13 b
12   64 -rw-r--r--  1 uqs  wheel   819200 Feb 10 09:13 c
   
   I bet you're using some non-C locale for LC_NUMERIC.  What does locale
   output tell you?
  
  Yes and no. LC_NUMERIC is still at C, LC_CTYPE is set to UTF-8, but as
  there are no non-ASCII symbols in that output it shouldn't matter, right? 
  For me, 819200 is smaller than 10485760 in pretty much all locales.  Why
  the hell is a numeric gnusort locale dependent?  Why is -g working anyway?
 
 Try adding a 'b' to your sort flags.  I bet the leading spaces in front of
 your numbers are being treated as part of the sort key.  Maybe de_DE.UTF-8
 and C have different ideas of what is whitespace?

Indeed, 'b' works too, so I have now stocked up on workarounds for this
problem. What amazes me is that no one seems to be as shocked as I am to
find out that something as basic as sorting on a number does not DTRT.

Bye,
Uli