The sort utility (written by Mike Haertel) that came with my Red Hat Linux 6.2 (obtained about April 2000), and that is part of GNU textutils 2.0a (December 1999), seems to have a fundamental flaw.
 
When the 'n' (for numeric) option is used either globally (as '-n') or for a particular field, the result is incorrect in many instances.  For example, the following data (two fields separated by a colon):
 
9:2020
2:900
5:900
1:1000
10:1350
9:1200
1:1100
2:850
4:950
1:2800
3:950
2:1200
5:950
3:800
 
when sorted numerically on the first field ONLY ('sort -t : -k 1,1n datafile' or 'sort -t : -n -k 1,1 datafile'), produces the following output:
 
2:850
2:900
3:800
3:950
4:950
5:900
5:950
1:1000
1:1100
1:2800
2:1200
9:1200
9:2020
10:1350
 
That is, the result of a numerical sort purely on the first field depends on whether the second field has 3 or 4 characters: every line with a 3-character second field is output before any line with a 4-character one.  Try it for yourself!  Other weird things happen in other circumstances.  Please find the test sort file 'datafile.txt' attached.
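
For comparison, here is a minimal Python sketch of the ordering I would expect from a numeric sort on the first field alone. It is only an illustration, not the sort program's actual algorithm. The function names are my own, and the tie-break on the whole line is an assumption about how equal keys should be ordered. The second function is a hypothetical model, not a diagnosis: it sorts each line as one number with the colon dropped, and for this data it happens to reproduce the output shown above.

```python
# Sample data from the report, in its original order.
DATA = """9:2020 2:900 5:900 1:1000 10:1350 9:1200 1:1100
2:850 4:950 1:2800 3:950 2:1200 5:950 3:800""".split()

# Output reported above for 'sort -t : -k 1,1n datafile'.
REPORTED = """2:850 2:900 3:800 3:950 4:950 5:900 5:950
1:1000 1:1100 1:2800 2:1200 9:1200 9:2020 10:1350""".split()


def first_field_numeric(lines):
    """Sort on the integer value of the first colon-separated field.

    Assumption: lines with equal keys are ordered by a plain
    character-by-character comparison of the whole line.
    """
    return sorted(lines, key=lambda line: (int(line.split(":")[0]), line))


def colon_ignored(lines):
    """Hypothetical model of the faulty result: drop the colon and
    compare each whole line as a single number."""
    return sorted(lines, key=lambda line: int(line.replace(":", "")))


if __name__ == "__main__":
    print("Expected order (first field only):")
    print("\n".join(first_field_numeric(DATA)))
    print("Colon-ignored model matches reported output:",
          colon_ignored(DATA) == REPORTED)
```

If that model is right, the numeric comparison is not stopping at the field separator, but I have not checked the source to confirm this.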
 
I look forward to your response.
 
Dr Craig A Martin
