I am reporting a behavior of uniq that seems to me to be a bug. I am using two test files containing the following lines:
tsttmp1:

    2/dl1/f01 lnk2/f01 Benvenue house, Kat in blue dress on back porch
    2/dl1/f02 lnk2/f02 Palm Springs, CA ????
    2/dl1/f03 lnk2/f03 Amerivox company picnic, Palo Alto, CA
    2/dl1/f03a lnk2/f03a Amerivox company picnic, Palo Alto, CA
    2/dl1/f04 lnk2/f04 Europe but where?
    2/dl1/f04a lnk2/f04 Europe but where?
    2/dl1/f05 lnk2/f05 Carol and Faith trip to Spain, etc.
    2/dl1/f06 lnk2/f06 Carol and Faith trip

tsttmp2:

    2/dl1/f01 lnk2/f01 Benvenue house, Kat in blue dress on back porch
    2/dl1/f02 lnk2/f02 Palm Springs, CA ????
    2/dl1/f03 lnk2/f03 Amerivox company picnic, Palo Alto, CA
    2/dl1/f03a lnk2/f03a Amerivox company picnic, Palo Alto, CA
    2/dl1/f04 lnk2/f04 Europe but where?
    2/dl1/f04a lnk2/f04 Europe but where?
    2/dl1/f05 lnk2/f05 Carol and Faith trip to Spain, etc.
    2/dl1/f06 lnk2/f06 Carol and Faith trip

Note that both files contain a pair of lines having 'lnk2/f04' as the second field. The fields in both files are separated by runs of space characters. No tabs are used.

I use the commands:

    $ uniq -f 1 -W 1 -D tsttmp1

and

    $ uniq -f 1 -W 1 -D tsttmp2

In both commands, the options call for examining _only_ field 2, so each command should report two duplicate lines. But not so. There is no report of duplicates for tsttmp1, while there is a report of two duplicate lines for tsttmp2.

I believe that the actual program treats a field as beginning with the first blank after a non-blank character. This is the standard behavior for 'sort', but it is inconsistent with 'info coreutils uniq', which states that a field begins with the first non-blank character after a string of blanks. What prevents a report for tsttmp1 is the differing number of leading blanks in that field on the two lines.

I suggest a fix for this in uniq:

1/ change the documentation to accurately describe the actual behavior.
2/ add an option, -b, to uniq that tells it to ignore leading blanks in a field, as is available in sort.
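To make the two interpretations of a field concrete, here is a small Python sketch. It is only a model of the behavior described above, not uniq's actual code. The function name field_key and the exact spacing in the two sample lines are assumptions made for illustration, since the real number of blanks in tsttmp1 is not shown in this report.

```python
import re

def field_key(line, skip_fields, check_fields, ignore_leading_blanks):
    """Return the text that would be compared for one line.

    Hypothetical model: a field is a run of blanks followed by a run
    of non-blanks, so each field carries its own leading blanks.
    """
    fields = re.findall(r"[ \t]*[^ \t]+", line)
    key = "".join(fields[skip_fields:skip_fields + check_fields])
    if ignore_leading_blanks:
        # Models the proposed -b option (and the documented behavior).
        key = key.lstrip(" \t")
    return key

# Assumed spacing: three blanks before field 2 in one line, two in the other.
a = "2/dl1/f04   lnk2/f04  Europe but where?"
b = "2/dl1/f04a  lnk2/f04  Europe but where?"

# Field begins at the first blank: the keys differ, so no duplicate is seen.
print(field_key(a, 1, 1, False) == field_key(b, 1, 1, False))

# Field begins at the first non-blank: the keys match.
print(field_key(a, 1, 1, True) == field_key(b, 1, 1, True))
```

Under the first interpretation the compared text includes the blanks in front of 'lnk2/f04', so the pair matches only when both lines happen to have the same number of blanks there, which would explain the difference between tsttmp1 and tsttmp2.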
Cheers,
-- 
Paul E Condon
[EMAIL PROTECTED]
