Matthew Jarvis wrote:
> If I had a list of net traffic with several thousand IP addresses, is
> there a quick/easy way to programmatically produce a list of what the
> matching domains are?
>
> For example, 65.61.159.9 resolves to www.bikefriday.com
>
> I want to nail a few abusers around here, or at least bring it to their
> attention that I'm watching.... (actually, I'm not anymore, but they
> won't know that...<g>)
First, let's pull the IP address out of the log files.
$ sub='s/.*[^0-9]([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).*/\1/p'
$ sed -rn "$sub" logfile1 logfile2 logfile3
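To see what that substitution keeps, here is a small check on a made-up log line. The `extract_ips` name and the sample line are only for illustration, and `-r` assumes GNU sed.

```shell
# Hypothetical helper: print one dotted-quad address per matching input line.
extract_ips() {
    sed -rn 's/.*[^0-9]([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).*/\1/p' "$@"
}

# A made-up firewall-style line, just for illustration.
echo 'Jun 12 10:01:02 gw kernel: SRC=65.61.159.9 DPT=80' | extract_ips
```

Two things to keep in mind: the leading `.*` is greedy, so a line holding several addresses yields only the last one, and the pattern wants a non-digit character in front of the address, so an address sitting at the very start of a line is skipped.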
Second, let's sort them by popularity.
$ sed -rn "$sub" logs | sort | uniq -c | sort -n
Too many. Let's just get the most popular ones.
$ sed -rn "$sub" logs | sort | uniq -c | sort -n | tail -n 20
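One thing to watch when counting: `uniq -c` only merges duplicate lines that are adjacent, so the addresses need to go through `sort` first if the counts are meant to be per-address totals. A minimal sketch of the difference, using made-up addresses and a hypothetical `count_lines` helper:

```shell
# Hypothetical helper: count occurrences of each line, least frequent first.
count_lines() {
    sort | uniq -c | sort -n
}

# Made-up addresses; 10.0.0.1 appears twice, but not on adjacent lines.
# Without the first sort, uniq -c reports it as two separate runs of 1.
printf '%s\n' 10.0.0.1 10.0.0.2 10.0.0.1 | uniq -c | sort -n

# With the first sort, the two occurrences are merged into a count of 2.
printf '%s\n' 10.0.0.1 10.0.0.2 10.0.0.1 | count_lines
```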
Now we can resolve those IP addresses into DNS names. Generate
one echo command per address, check that it looks right, then
pipe the commands to sh.
$ sed -rn "$sub" logs | sort | uniq -c | sort -n | tail -n 20 | \
> sed -r 's/(.*) (.*)/echo "\1" \2 `host \2`/'
$ sed -rn "$sub" logs | sort | uniq -c | sort -n | tail -n 20 | \
> sed -r 's/(.*) (.*)/echo "\1" \2 `host \2`/' | sh
Ugly output. Clean it up.
$ sed -rn "$sub" logs | sort | uniq -c | sort -n | tail -n 20 | \
> sed -r 's/(.*) (.*)/echo "\1" \2 `host \2`/' | sh | \
> awk '/NXDOMAIN/ { print $1, $2 } !/NXDOMAIN/ { print $1, $NF }'
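The awk step can be tried without touching DNS. The sketch below assumes each line reaching awk holds the count, the address, and then whatever `host` printed; the counts are made up, and the text after each address mimics typical `host` output. `clean_host_output` is a hypothetical name.

```shell
# Hypothetical helper wrapping the awk cleanup step.
clean_host_output() {
    awk '/NXDOMAIN/ { print $1, $2 } !/NXDOMAIN/ { print $1, $NF }'
}

# First line: a successful lookup, so the last field (the name) is kept.
# Second line: a failed lookup, so the address itself is kept.
clean_host_output <<'EOF'
36 65.61.159.9 9.159.61.65.in-addr.arpa domain name pointer www.bikefriday.com.
31 211.101.226.193 Host 193.226.101.211.in-addr.arpa. not found: 3(NXDOMAIN)
EOF
```

Any failure other than NXDOMAIN (a timeout, say) falls through to the second rule and prints the last word of the error message, so it may be worth eyeballing the output for oddities.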
Finally, you can put all that into a shell script so you
never have to type it again.
$ cat > bin/tophosts
#!/bin/sh
sed -rn 's/.*[^0-9]([0-9]+\.[0-9]+\.[0-9]+\.[0-9]+).*/\1/p' ${1+"$@"} |
sort |
uniq -c |
sort -n |
tail -n 20 |
sed -r 's/(.*) (.*)/echo \1 \2 `host \2`/' |
sh |
awk '/NXDOMAIN/ { print $1, $2 } !/NXDOMAIN/ { print $1, $NF }'
^D
$ chmod +x bin/tophosts
$ tophosts logs
30 216-210-236-195.atgi.net.
31 host248.orcasinc.com.
31 211.101.226.193
36 host248.orcasinc.com.
36 relato.pro.br.
... et cetera ...
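If generating shell commands and piping them to sh feels too clever, the lookup step can probably be written as a read loop instead. This is only a sketch: `resolve_counts` and the `RESOLVER` variable are hypothetical names, and the function assumes its input is "count address" pairs as produced by `uniq -c`.

```shell
# Hypothetical variant of the lookup step: read "count address" pairs and
# call the resolver directly rather than generating shell text.
# RESOLVER is an assumption of this sketch; it defaults to host.
resolve_counts() {
    while read -r count addr; do
        # The command substitution is left unquoted on purpose so that
        # multi-line resolver output is joined onto one line.
        echo "$count" "$addr" $("${RESOLVER:-host}" "$addr")
    done
}
```

It would take the place of the `sed ... | sh` pair in the pipeline, with the awk cleanup after it unchanged.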
Whose turn is it to advocate Python today?
--
Bob Miller K<bob>