On 5/24/05, Mike Odegard <[EMAIL PROTECTED]> wrote:
> I need help creating a script to pull information out of a log file.
> This file is on our System Log server.
> 
> After I grep to a subset of the file for some tcp/ip protocol, such as
> smtp, ftp, etc., I need to run a script on the subset file to extract
> all the IP addresses, creating a list, dropping duplicates, but then
> showing the line count for each matching IP address.
> 
> For example, each record has 'src=xxx.xxx.xxx.xxx' with the IP address
> of the requesting machine.
> I want to pull out all of the IP addresses on the 'src=' field, dropping
> duplicates.
> Then count how many lines for each IP address found in the previous
> step, such as found with 'wc -l'.
> 
> 10.43.223.44   143
> 10.67.11.329   2402
> 10.11.5.208     8

"dropping duplicates" and "count how many lines for each" don't fit
together too well.  I think you meant "report unique IP addresses and
how many of each".

Here is a bit of script that I got from comp.lang.awk, I think.  It
pulls IP addresses out of fairly arbitrary messed-up text streams.  The
technique is to use tr(1) to get rid of everything that is not a digit
or a dot, mapping those characters to newlines (\012) and squeezing out
runs of newlines.  The resulting stream is then filtered by awk(1) to select
lines that have exactly four fields with numeric values suitable to be
an IP address.

tr -cs '0-9.' '\012' |        # everything but digits and dots becomes a newline
awk -F'.' 'NF==4 && $1>0 && $1<=255 && $2<=255 &&
           $3<=255 && $4<=255 && !/\.\./'   # four in-range octets, none empty

Piping the output of this to sort | uniq -c should do the job.
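Putting it all together, the whole pipeline might look something like
this (the grep pattern and log file name here are just placeholders for
whatever you are already using):

grep smtp /var/log/messages |
tr -cs '0-9.' '\012' |
awk -F'.' 'NF==4 && $1>0 && $1<=255 && $2<=255 &&
           $3<=255 && $4<=255 && !/\.\./' |
sort | uniq -c | sort -rn     # count each unique address, busiest first

Note that uniq -c puts the count before the address; if you want the
address first, as in your example, tack on | awk '{print $2 "\t" $1}'.
Also, since the tr/awk pair grabs every IP address on the line, not
just the one in the src= field, you may want to isolate that field
first with something like sed -n 's/.*src=\([0-9.]*\).*/\1/p' in its
place.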

    carl
-- 
    carl lowenstein         marine physical lab     u.c. san diego
                                                 [EMAIL PROTECTED]

