I'm trying to figure out how to use awk, from within a shell script, to parse values from a string of unknown length with an unknown number of fields, and write those values to a file in a certain order.

Here's a typical string that I want to parse:

alert ip [50.0.0.0/8,100.0.0.0/6,104.0.0.0/5,112.0.0.0/6,173.0.0.0/8,174.0.0.0/7,176.0.0.0/5,184.0.0.0/6] any -> $HOME_NET any (msg:"ET POLICY Reserved IP Space Traffic - Bogon Nets 2"; classtype:bad-unknown; reference:url,www.cymru.com/Documents/bogon-list.html; threshold: type limit, track by_src, count 1, seconds 360; sid:2002750; rev:10;)

What I want to do is extract the value after "sid:", the value after "reference:" and the value after "msg:" and insert them into a file that would look like this:

2002750 || "ET POLICY Reserved IP Space Traffic - Bogon Nets 2" || url,www.cymru.com/Documents/bogon-list.html

Yes, I know I could do this easily in Perl. I'm doing this to try and improve my understanding of awk. I *think* I've figured out that the right approach is to use an associative array, and this command:

# awk '!/#/ { for (i=1; i<=NF; i++) { if ( $i ~ /sid/) {mtcmsg[sid]=$i; print mtcmsg[sid]}}}' < /usr/local/etc/snort/rules/mtc.rules.test

produces this data:
sid:299913;
sid:52123;
sid:3001441;
sid:1444;
sid:2008120;
sid:5001684;
sid:2001683;
sid:22466;
sid:2002750;
sid:3000003;
sid:292000032;
sid:22000032;
sid:3000000;
sid:2003070;
sid:2003484;
sid:2003603;
sid:31000004;
sid:299998;

So it appears (at least to me) that I'm on the right path, but I thought I'd query the awk gurus on the list. Is there a better way to approach this?

The standard FS breaks the msg into multiple fields, which is unacceptable. So my thinking is that I would need to do something like this (pseudocode):

!/#/; FS=";" { if ($i ~ /sid/) then use tr to strip the "sid:" and ";" and insert the result into an element named sid
if ($i ~ /reference/) then ditto into an element named ref
if ($i ~ /msg/) then ditto into an element named msg
then print array[sid]" || "array[msg]" || "array[ref] > resulting file }
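To make the pseudocode above concrete, here is a rough sketch of the same idea: split each rule on ";" and pick out values by their keyword prefix. It is only a sketch under some assumptions: extract_rule_fields is a made-up name, the msg text is assumed not to contain a literal ";", and only the first reference in a rule is kept. No tr is needed, since substr() can drop the keyword.

```shell
#!/bin/sh
# Sketch only: extract_rule_fields is a made-up name, not part of any tool.
# Reads Snort rules on stdin and prints: sid || "msg" || reference
extract_rule_fields() {
    awk -F';' '
    /^[ \t]*#/ { next }                       # skip commented-out rules
    {
        # Subscripts are quoted strings; a bare sid would be an empty variable.
        val["sid"] = val["msg"] = val["ref"] = ""
        for (i = 1; i <= NF; i++) {
            f = $i
            sub(/^[ \t]+/, "", f)             # drop the blank that follows each ";"
            if (f ~ /^sid:/) {
                val["sid"] = substr(f, 5)     # text after "sid:"
            } else if (f ~ /^reference:/ && val["ref"] == "") {
                val["ref"] = substr(f, 11)    # first reference only (assumption)
            } else if (match(f, /msg:"[^"]*"/)) {
                # keep the quotes, drop the leading msg:
                val["msg"] = substr(f, RSTART + 4, RLENGTH - 4)
            }
        }
        if (val["sid"] != "")
            print val["sid"] " || " val["msg"] " || " val["ref"]
    }'
}
```

It would be used along the lines of "extract_rule_fields < /usr/local/etc/snort/rules/mtc.rules.test > outfile", where outfile stands for whatever the resulting file should be called.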

But when I add an FS to the script, I get odd results:

# awk '!/#/ { FS=";"; for (i=1; i<=NF; i++) { if ( $i ~ /sid/) {mtcmsg[sid]=$i; print mtcmsg[sid]}}}' < /usr/local/etc/snort/rules/mtc.rules.test
sid:299913;
sid:52123
sid:3001441
sid:1444
sid:2008120
sid:5001684
sid:2001683
sid:22466
sid:2002750
sid:3000003
sid:292000032
sid:22000032
sid:3000000
sid:2003070
sid:2003484
sid:2003603
sid:31000004
sid:299998

Why is the first value indented and not stripped of the semi-colon?
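For anyone who wants to try this without my rules file, here is a cut-down reproduction using two made-up records (they are not real Snort rules). The brackets make any leading blank visible. The second function is there for comparison only: it sets the separator with -F before any input is read, on the assumption that the timing of the FS assignment matters, since POSIX awk applies a new FS value to the next record read rather than to the current one.

```shell
#!/bin/sh
# Two made-up records; the function names are illustrative only.
sample() {
    printf '%s\n' 'alert a (msg:"one"; sid:1; rev:1;)' \
                  'alert b (msg:"two"; sid:2; rev:1;)'
}

# FS assigned inside the action, as in the command above.
split_inside_action() {
    sample | awk '{ FS=";"; for (i = 1; i <= NF; i++) if ($i ~ /sid/) print "[" $i "]" }'
}

# FS set on the command line, before the first record is read.
split_from_start() {
    sample | awk -F';' '{ for (i = 1; i <= NF; i++) if ($i ~ /sid/) print "[" $i "]" }'
}

split_inside_action
# [sid:1;]
# [ sid:2]

split_from_start
# [ sid:1]
# [ sid:2]
```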

--
Paul Schmehl, Senior Infosec Analyst
As if it wasn't already obvious, my opinions
are my own and not those of my employer.
*******************************************
"It is as useless to argue with those who have
renounced the use of reason as to administer
medication to the dead." Thomas Jefferson
