On Fri, Mar 11, 2005 at 03:05:55PM -0500, Ian Sue Wing wrote:
> I have examined the file visually in a text editor,

You missed the fact that the file is not a Unix text file. It contains
carriage-return characters, but only on some lines. There are 17637
carriage-return characters in the file.

[...]

> I then fired up my trusty old MKS Toolkit and ran its implementation of
> uniq. Running MKS visual diff on the original and uniquified files
> identified about 8700 line differences, consistent with my earlier
> calculations.

The MKS Toolkit is intended to run on DOS, so it is insensitive to
carriage-return characters. After the carriage returns are removed,
there are 8671 duplicated lines in the input file. Before they are
removed, however, the file contains no duplicate lines, because the
apparently identical lines are distinguished by the fact that some of
them contain a carriage-return character.

> Is this a bug in CYGWIN's implementation of uniq or a silly error
> on my part? Last I checked, uniq was simple, straightforward to use, and
> had nuclear-hardened reliability.

Yes, uniq is normally reliable, and in this case you would have been
right to trust it. Carriage returns are for the most part an insidious
evil.

Regards,
James Youngman.

_______________________________________________
Bug-coreutils mailing list
Bug-coreutils@gnu.org
http://lists.gnu.org/mailman/listinfo/bug-coreutils
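
For anyone who wants to check a file of their own for the condition
described in this message (carriage returns on only some lines), here is
a minimal Python sketch. It is not how GNU uniq or the MKS Toolkit is
implemented; the function names and the sample data are made up for
illustration, and it only assumes that uniq drops a line when it is
byte-for-byte identical to the line before it.

```python
from itertools import groupby


def count_carriage_returns(data: bytes) -> int:
    """Return the number of carriage-return bytes in the data."""
    return data.count(b"\r")


def adjacent_duplicates(data: bytes, strip_cr: bool = False) -> int:
    """Count lines identical to the line immediately before them.

    This approximates how many lines a plain `uniq` would drop.
    With strip_cr=True, every carriage return is removed first.
    """
    if strip_cr:
        data = data.replace(b"\r", b"")
    lines = data.split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # ignore the empty piece after a trailing newline
    # Each run of n identical adjacent lines contributes n - 1 duplicates.
    return sum(len(list(group)) - 1 for _, group in groupby(lines))


# Hypothetical sample: looks like two duplicated pairs in an editor,
# but one line of each pair ends in a carriage return.
sample = b"alpha\r\nalpha\nbeta\nbeta\r\ngamma\n"

print(count_carriage_returns(sample))              # 2
print(adjacent_duplicates(sample))                 # 0
print(adjacent_duplicates(sample, strip_cr=True))  # 2
```

On the sample, no duplicates are seen until the carriage returns are
stripped, which is the same effect reported above for the real file. To
run it on an actual file, read the file in binary mode (for example
`open(path, "rb").read()`) so that Python does not translate the line
endings.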