Following up, I noticed a pattern among the outputs of | sort | uniq -u versus
| sort -u:
The three files that I evaluated were 26.1GB, 12GB, and 2.0GB, respectively:
1. The original file, the result of grepping about 10GB of Nmap output files,
with many duplicates;
2. The | sort -u file; and
3. The | sort | uniq -u file, the smallest of the three.
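A minimal sketch of the documented difference between the two pipelines, using a made-up six-line sample rather than the real data (the sample and the variable names are assumptions for illustration only). As the man pages describe it, sort -u keeps one copy of every distinct line, while uniq -u keeps only the lines that were never repeated, which would explain why its output is the smallest:

```shell
# Hypothetical sample: "a" appears three times, "b" twice, "c" once.
sample=$(printf '%s\n' a b a c b a)

# sort -u: one copy of every distinct line.
sort_u=$(printf '%s\n' "$sample" | LC_ALL=C sort -u)

# sort | uniq -u: only lines that occur exactly once in the input;
# every copy of a duplicated line is dropped.
uniq_u=$(printf '%s\n' "$sample" | LC_ALL=C sort | uniq -u)

# sort | uniq (no -u): one copy of every distinct line, like sort -u.
uniq_plain=$(printf '%s\n' "$sample" | LC_ALL=C sort | uniq)

printf 'sort -u:\n%s\n\n' "$sort_u"              # a, b, c
printf 'sort | uniq -u:\n%s\n\n' "$uniq_u"       # c
printf 'sort | uniq:\n%s\n' "$uniq_plain"        # a, b, c
```

If the goal is one copy of each line, plain uniq after sort (without -u) should give the same result as sort -u.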
I applied comm (with no options):
comm IPv6-uniq.lns01.v6.018.net.il.txt IPv6-uniqB.lns01.v6.018.net.il.txt \
    > IPv6-commAll.lns01.v6.018.net.il.txt
An excerpt from this last command's output is attached. It has no Column $2
(lines unique to the second, smaller file); Column $3 (the less well
represented of the two columns present) has nothing obviously different from
the entries above and below.
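The same column pattern can be reproduced on a small scale. This is only a sketch on an invented sample (the temporary file names and variables are hypothetical, not the real files): comm's -1, -2, and -3 options suppress the corresponding column, so each column can be inspected on its own.

```shell
# Hypothetical sample: "a" three times, "b" twice, "c" once.
tmp=$(mktemp -d)
printf '%s\n' a b a c b a | LC_ALL=C sort -u > "$tmp/sort_u.txt"
printf '%s\n' a b a c b a | LC_ALL=C sort | uniq -u > "$tmp/uniq_u.txt"

# Column 1: lines only in the first (sort -u) file.
only_first=$(LC_ALL=C comm -23 "$tmp/sort_u.txt" "$tmp/uniq_u.txt")

# Column 2: lines only in the second (uniq -u) file.
only_second=$(LC_ALL=C comm -13 "$tmp/sort_u.txt" "$tmp/uniq_u.txt")

# Column 3: lines present in both files.
both=$(LC_ALL=C comm -12 "$tmp/sort_u.txt" "$tmp/uniq_u.txt")

rm -rf "$tmp"

printf 'only in sort -u file:\n%s\n\n' "$only_first"    # a, b
printf 'only in uniq -u file:\n%s\n\n' "$only_second"   # (empty)
printf 'in both files:\n%s\n' "$both"                   # c
```

In this sample Column 2 is empty because every line that survives uniq -u also survives sort -u, and Column 3 holds the lines that occurred only once in the input, so they would look like any other entry.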
Not to contradict man uniq's description of uniq -u, but I'm suspicious. I'll
be using sort -u
from now on.