At present it is necessary to use both sort and uniq if you want a
tabulated list of unique items. As an example, I needed to do so for an
error log to determine what errors were most frequent:
sort .xsession-errors | uniq --count | sort -n
If I were interested only in the unique lines without counts I could
have used sort -u, but that does not give me any feel for how frequent a
line is in the file. The above works well as long as the file is not
too large, but obviously takes a lot of both time and temporary space if
If sort could count duplicates
while keeping only the unique lines there would be only one pass
through the file and no excessive temporary space would be needed.
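In the meantime a single-pass count is possible with awk, at the cost
of one in-memory entry per distinct line; this is just a sketch of a
workaround, not a substitute for having sort do it:
awk '{ c[$0]++ } END { for (l in c) print c[l], l }' .xsession-errors | sort -n
Unlike sort this writes no temporary files, but its memory use grows
with the number of distinct lines rather than staying bounded.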
My incentive for processing this particular log file was some sort of
runaway error logging that actually filled the partition the log file
was on, at which point the file was about 7G of data. I wound up using
tail to take the last million lines, getting the most frequent errors
from that, and assuming that this was typical of the whole file, but of
course it would have been nice to know for certain.
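The stopgap was roughly:
tail -n 1000000 .xsession-errors | sort | uniq --count | sort -n
which at least kept the data down to a size sort could cope with.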
Dave