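It looks like many of the lines end with a carriage return followed by a
newline (\r\n), while the others end with only a newline (\n). uniq
compares whole lines byte for byte, so two lines that differ only in
their ending would not count as duplicates. Is it possible the other
tools are ignoring line-ending differences?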
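
One way to check this is sketched below. It assumes a POSIX shell with
printf, grep, tr, uniq and mktemp available, and it builds its own tiny
sample file (the name "sample" is made up for illustration) rather than
touching the attached data. Note that tr -d '\r' drops every carriage
return, not only those at the end of a line, which is assumed to be
acceptable for tab-delimited text.

```shell
# Hypothetical sample: two lines with the same text, one ending in
# \r\n and one ending in \n.
sample=$(mktemp)
printf 'a\tb\r\na\tb\n' > "$sample"

# Count the lines that end in a carriage return.
cr=$(printf '\r')
crlf_lines=$(grep -c "${cr}\$" "$sample")
echo "lines ending in CR: $crlf_lines"      # 1

# uniq sees the two lines as different because of the stray \r.
before=$(uniq "$sample" | wc -l | tr -d ' ')
echo "lines after plain uniq: $before"      # 2

# Strip the carriage returns first, then run uniq.
after=$(tr -d '\r' < "$sample" | uniq | wc -l | tr -d ' ')
echo "lines after tr + uniq: $after"        # 1

rm -f "$sample"
```

Running the same grep and tr steps against test_file_for_uniq should show
whether mixed line endings account for the missing duplicates.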

-David



Ian Sue Wing wrote:

>Greetings,
>
>Yesterday I downloaded and installed a copy of CYGWIN. I am using the 
>uniq utility to purge duplicate line entries from a large, tab-delimited 
>file with several columns of data. (The file, which I have already run 
>through sort, is included as a .bz2 attachment. It has about 60,000 lines.)
>
>I have examined the file visually in a text editor, and confirmed that 
>it has duplicate lines. I have loaded the file into Excel and calculated 
>that there are about 8700 duplicate lines. However, in the CYGWIN Bash 
>shell, typing
>
>uniq test_file_for_uniq > foo; diff test_file_for_uniq foo
>
>shows no changes between the files. Examining the uniquified file 'foo' 
>in Excel reveals it to be identical to the original.
>
>I then fired up my trusty old MKS Toolkit and ran its implementation of 
>uniq. Running MKS visual diff on the original and uniquified files 
>identified about 8700 line differences, consistent with my earlier 
>calculations.
>
>Is this a bug in CYGWIN's implementation of uniq or a silly error 
>on my part? Last I checked, uniq was simple, straightforward to use, and 
>had nuclear-hardened reliability.
>
>-i
>
>  
>
>------------------------------------------------------------------------
>
>_______________________________________________
>Bug-coreutils mailing list
>[email protected]
>http://lists.gnu.org/mailman/listinfo/bug-coreutils
>  
>


-- 
---------------------------------------------------------
D a v i d  E i s n e r        c r a d l e @ u m d . e d u   
CALCE EPSC                         University of Maryland    


