> [email protected] wrote:
>>> Need some scripting help to sort a list and list all the duplicate
>>> lines.
>>>
>>> My data looks something like this
>>>
>>> host6:dev406mum.dd.mum.test.com:22:11:11:no
>>> host7:dev258mum.dd.mum.test.com:36:17:19:no
>>> host7:dev258mum.dd.mum.test.com:36:17:19:no
>>> host17:dev258mum.dd.mum.test.com:31:17:19:no
>>> host12:dev258mum.dd.mum.test.com:41:17:19:no
>>> host2:dev258mum.dd.mum.test.com:36:17:19:no
>>> host4:dev258mum.dd.mum.test.com:41:17:19:no
>>> host4:dev258mum.dd.mum.test.com:45:17:19:no
>>> host4:dev258mum.dd.mum.test.com:36:17:19:no
>>>
>>> I need to sort this list and print all the lines where column 3 has a
>>> duplicate entry.
>>>
>>> I need to print the whole line, if a duplicate entry exists in column
>>> 3.
>>>
>>> I tried using a combination of "sort" and "uniq" but was not
>>> successful.
>>
>> list.awk
>> BEGIN {
>> FS=":";
>> }
>> { if ( $3 == last ) {
>>
>> print $0;
>> }
>> last = $3;
>> }
>>
>> sort <file> | awk -f list.awk
>>
>> mark "*how* long an awk script would you like?"
>
> This doesn't print the first of the duplicates. Also, the question
> wasn't clear as to whether every line with a matching 3rd field should
> be printed, or only those where the other (or preceding) fields also
> match (but the sort options could control that).
Oh, sorry:
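BEGIN {
    FS = ":";
}
{
    if ( $3 == last ) {
        # Same column 3 as the previous line: a duplicate.
        if ( first == 0 ) {
            # Print the saved first line of this group once.
            print saved;
            first++;
        }
        print $0;
    }
    else {
        # New column 3 value: remember the line in case a duplicate follows.
        first = 0;
        last = $3;
        saved = $0;
    }
}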
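
One caveat: the script above only compares each line with the one before it, so the input has to be sorted on column 3, not on the whole line. A minimal sketch of the full pipeline, assuming a POSIX-style sort and awk; dup_col3 is just a name made up for this example, and the NR > 1 guard is an addition that is not in the script above:

```shell
# Sketch: print every line whose 3rd colon-separated field occurs more
# than once. dup_col3 is a hypothetical helper name; it reads stdin.
dup_col3() {
    # Sort on field 3 only, so equal values end up on adjacent lines.
    # LC_ALL=C keeps the ordering independent of the user's locale.
    LC_ALL=C sort -t: -k3,3 |
    awk -F: '
        NR > 1 && $3 == last {   # same key as the previous line
            if (!shown) {        # first repeat: emit the saved first line
                print saved
                shown = 1
            }
            print
            next
        }
        { last = $3; saved = $0; shown = 0 }   # new key: remember the line
    '
}
```

It would be called as "dup_col3 < file". On the sample data from the original post this should print the four lines with 36 in column 3 and the two lines with 41, and nothing for 22, 31 or 45.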
mark "did I mention that I've written 100-200 line awk scripts?"
_______________________________________________
CentOS mailing list
[email protected]
http://lists.centos.org/mailman/listinfo/centos