bug#22236: Not exactly a bug...

Assaf Gordon Fri, 25 Dec 2015 15:38:51 -0800

tag 22236 notabug
close 22236
thanks

Hello Todd,

> On Dec 25, 2015, at 13:37, Todd Shandelman <[email protected]> wrote:

[...]

> So it looks like that for chars, 'uniq' has options to compare only the first 
> N chars, or *all but* the first N chars.

> 
> Whereas for fields, 'uniq' has only the option to skip the first N fields, 
> but has no corresponding option to compare *only* the first N fields.
> 
> Why this lack of symmetry?

This lack of symmetry originates from the POSIX standard:
  http://pubs.opengroup.org/onlinepubs/9699919799/utilities/uniq.html
which codified the features that existed at that time.

GNU Coreutils' uniq program has added a few more features, and there is a 
working plan to add the ability to use specific fields ( 
http://lists.gnu.org/archive/html/coreutils/2013-02/msg00082.html , 
http://lists.gnu.org/archive/html/coreutils/2013-09/msg00047.html ), but this 
has not yet been integrated into the main program - perhaps in future versions.

> And what do I do when I need that missing functionality, to compare only an 
> initial subset of fields in each line?

To print only the lines that are unique in specific fields, you can use 'sort'.

For example, given the following sample input file:

    $ cat input.txt
    1   A       10      x       100
    5   B       14      z       104
    2   A       11      x       101
    3   B       12      y       102
    4   B       13      z       103

Print only lines with unique values in columns 2 and 4:

    $ sort -k2,2 -k4,4 -s -u input.txt
    1   A       10      x       100
    3   B       12      y       102
    5   B       14      z       104

This can be extended to include as many fields as you need.
If the fields are consecutive, you can specify them as a single range, like so:

    $ cat input2.txt
    A   x       1       97
    B   x       1       96
    A   x       1       99
    A   x       1       98

    $ sort -k1,3 -u input2.txt 
    A   x       1       97
    B   x       1       96
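
Note that 'sort' reorders the input, whereas 'uniq' only compares adjacent
lines. If you need the uniq-like behaviour (compare only the first N fields
of neighbouring lines, keeping the input order), a small awk wrapper is one
possible workaround. This is only a rough sketch: 'first_fields_uniq' is a
hypothetical helper name, not part of coreutils, and it assumes fields are
separated by blanks or tabs (awk's default splitting).

```shell
# Hypothetical helper (not part of coreutils): print a line only when its
# first N whitespace-separated fields differ from those of the previous line.
first_fields_uniq() {
    n=$1
    shift
    awk -v n="$n" '
    {
        # Build a comparison key from the first n fields only.
        key = ""
        for (i = 1; i <= n && i <= NF; i++)
            key = key $i SUBSEP
        # Always print the first line; afterwards print only on a key change.
        if (NR == 1 || key != prev)
            print
        prev = key
    }' "$@"
}

# Same data as input2.txt above; the tab separators are an assumption.
printf 'A\tx\t1\t97\nB\tx\t1\t96\nA\tx\t1\t99\nA\tx\t1\t98\n' |
    first_fields_uniq 3
```

On this data the helper keeps the 97, 96 and 99 lines: the third line is kept
because its neighbour (the B line) differs, exactly as 'uniq' would treat
non-adjacent duplicates. Sorting the input first would collapse them.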

regards,
 - assaf
