Public bug reported:
Binary package hint: coreutils
Tested with sort 5.2.1 from coreutils 5.2.1-2ubuntu0 and sort 5.93 from
coreutils 5.93-5ubuntu4.
sort's -u (unique) option isn't working as expected and the info (spit!)
documentation doesn't match its behaviour either so I'd expect one or
the other to change.
Given:
$ echo 1/3,1/2,1/1,2/1 | tr , \\012 | sort -t / -k 1,2
1/1
1/2
1/3
2/1
OK. Add -n (numeric):
$ echo 1/3,1/2,1/1,2/1 | tr , \\012 | sort -n -t / -k 1,2
1/1
1/2
1/3
2/1
Still OK. Add -u (unique):
$ echo 1/3,1/2,1/1,2/1 | tr , \\012 | sort -nu -t / -k 1,2
1/3
2/1
Despite sorting on fields 1 to 2 inclusive, the uniqueness has been
judged on just field 1. At least that's my guess at what's happening,
backed up by:
$ echo 1/3,1/2,1/1,2/1 | tr , \\012 | sort -u -t / -k 1,1
1/3
2/1
and:
$ echo 1/3,1/2,1/1,2/1 | tr , \\012 | sort -u -t / -k 2,2
1/1
1/2
1/3
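To separate the effect of -n from that of -u, the following sketch runs
the same four lines through -u with the 1,2 key, once with -n and once
without. The use of printf, the input helper and LC_ALL=C are additions
for reproducibility and are not part of the commands above. If the run
without -n keeps all four lines, that would point at the numeric
comparison rather than the key range, though this has not been checked
against the two versions listed at the top.

```shell
# Same four lines as in the report; printf replaces the echo | tr step.
# "input" is a helper name introduced here for illustration only.
input() { printf '%s\n' 1/3 1/2 1/1 2/1; }

# LC_ALL=C is an assumption added here to pin the collation order.
with_n=$(input | LC_ALL=C sort -nu -t / -k 1,2)
without_n=$(input | LC_ALL=C sort -u -t / -k 1,2)

echo "with -n:"
echo "$with_n"
echo "without -n:"
echo "$without_n"
```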
I'd expect that if sorting on N fields, lines omitted due to -u have to
match the line output in all N fields. The manual doesn't suggest
anything different:
`-u'
`--unique'
     Normally, output only the first of a sequence of lines that
     compare equal.  For the `--check' (`-c') option, check that no
     pair of consecutive lines compares equal.
The lines clearly don't compare equal since without -u sort reverses the
order of the first three lines of input. Consequently, all three should
be output with -u. The POSIX spec. seems to support the info
documentation:
http://www.opengroup.org/onlinepubs/000095399/utilities/sort.html
At the very least this seems to need a documentation fix but I believe
the behaviour is wrong. The impact of it can be serious, e.g. I'm in the
middle of sorting a list of files to back up!
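A possible workaround in the meantime, offered only as a sketch: give
each field its own numeric key instead of one key spanning fields 1 to
2. This assumes the cause is that -n reads just the leading number of
the combined key, which is a guess and not something confirmed in this
report. LC_ALL=C and the variable name are additions for the example.

```shell
# Hypothetical workaround: one numeric key per field rather than -n -k 1,2.
# LC_ALL=C is added here only to make the run reproducible.
per_field=$(printf '%s\n' 1/3 1/2 1/1 2/1 |
    LC_ALL=C sort -u -t / -k 1,1n -k 2,2n)

echo "$per_field"
```

With the four-line input used throughout, every pair of fields is
distinct, so this should print 1/1, 1/2, 1/3 and 2/1, one per line.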
** Affects: coreutils (Ubuntu)
Importance: Untriaged
Status: Unconfirmed
--
sort's -u is Failing to Check all -k fields for Uniqueness.
https://launchpad.net/bugs/56891