The sort utility (written by Mike Haertel) that came with my
Red Hat Linux 6.2 (obtained about April 2000), and that is part of
GNU textutils 2.0a (December 1999), seems to have a fundamental
flaw.
When the 'n' (for numeric) option is used either globally (as
'-n') or for a particular field, the result is incorrect in many
instances. For example, the following data (two fields separated by a
colon):
9:2020
2:900
5:900
1:1000
10:1350
9:1200
1:1100
2:850
4:950
1:2800
3:950
2:1200
5:950
3:800
when sorted numerically on the first field ONLY
('sort -t : -k 1,1n datafile' or 'sort -t : -n -k 1,1 datafile'),
produces the following output:
2:850
2:900
3:800
3:950
4:950
5:900
5:950
1:1000
1:1100
1:2800
2:1200
9:1200
9:2020
10:1350
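That is, a numerical sort purely on the first field appears to
depend on whether the next field has 3 or 4 characters: every line
with a 3-character second field comes before every line with a
4-character one, regardless of the first field. Try it for
yourself! Other weird things happen in other circumstances.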
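
For comparison, here is a minimal Python sketch of the ordering one would expect from a numeric sort on the first field only. It is not the sort source code. The helper names (first_field, expected_sort, is_sorted_on_first_field, digits_only) are made up for illustration, and breaking ties by comparing whole lines is an assumption about how equal keys should be ordered. The last check records an observation about this particular data set, not a confirmed diagnosis: the reported output coincides with sorting each line as one number with the colon dropped.

```python
# Input data and the output reported above, copied verbatim.
DATA = """9:2020
2:900
5:900
1:1000
10:1350
9:1200
1:1100
2:850
4:950
1:2800
3:950
2:1200
5:950
3:800""".splitlines()

REPORTED = """2:850
2:900
3:800
3:950
4:950
5:900
5:950
1:1000
1:1100
1:2800
2:1200
9:1200
9:2020
10:1350""".splitlines()


def first_field(line, sep=":"):
    """Return the first separator-delimited field as an integer."""
    return int(line.split(sep, 1)[0])


def expected_sort(lines):
    """Sort numerically on the first field only.

    Assumption: lines whose first fields are equal are ordered by
    comparing the whole line as text.
    """
    return sorted(lines, key=lambda line: (first_field(line), line))


def is_sorted_on_first_field(lines):
    """True if the first fields appear in non-decreasing numeric order."""
    keys = [first_field(line) for line in lines]
    return keys == sorted(keys)


def digits_only(line, sep=":"):
    """Read the whole line as one integer, ignoring the separator."""
    return int(line.replace(sep, ""))


if __name__ == "__main__":
    print("\n".join(expected_sort(DATA)))
    print("reported output ordered on field 1:",
          is_sorted_on_first_field(REPORTED))
    print("reported output matches separator-ignored order:",
          sorted(DATA, key=digits_only) == REPORTED)
```

In the expected ordering, the three lines beginning with 1 come first and 10:1350 comes last, which is not what the reported output shows.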
Please find the test sort file 'datafile.txt' attached.
I look forward to your response.
Dr Craig A Martin