Okay, here is a small contribution to the list.

Markus, this script:

grep "Total weight =" m:\imail\spool\spam\log\dec0628.log | gawk "{print $2, $NF}" > log0628.txt

will output a file called log0628.txt in the following space-delimited
format (snip):

16:35:17 64
16:35:29 78
16:35:39 0
16:36:10 1
16:36:35 69
16:36:39 -13
16:36:50 90
16:36:51 37
16:36:55 74

As Markus noted, the UNIX utilities needed to run these scripts can be
found at:

http://unxutils.sourceforge.net/

There is no installation; simply extract the files contained in the zip
file into a directory and you're all set.

Here are a couple of additional scripts to get you thinking about the
power of these utilities. Hopefully people will share their own scripts
with the list as they develop them.

The following script will list all of your Declude tests and show how
many messages were flagged by each test:

egrep "Message OK|Msg failed" m:\imail\spool\spam\log\dec0615.log | gawk "{print $6}" | sort | uniq -c | sort -rn

This will output a report like the following in less than 30 seconds.
If any of you have run some of the other JunkMail log reporting tools,
which can take hours to produce a report, you will find this quite
extraordinary in comparison:

  9870 SPAMCHECK
  8827 NOLEGITCONTENT
  8082 IPNOTINMX
  7728 SM-SPAM-L1
  7466 SM-SPAM-L2
  7154 SPAMSNIFFER
  6793 WEIGHT36->
  6541 SM-SPAM-L3
  5749 REYNOLDS
  5698 HEADERS-FILTER
  5058 EASYNET-DNSBL
  4867 SM-SPAM-L4
  3932 SUBJECT-FILTER
  3762 BODY-FILTER
  3610 OSSRC
  2973 SPAMHAUS
  2902 OK
  2827 SPAMCOP
  2759 NJABL
  2605 OSSOFT
  2497 SM-SPAM-L5
  2480 INTERSIL
  1807 NOMOREFUNN
  1486 VOX
  1420 BLARSBL
  1300 FIVETEN-SRC
  1290 MAILFROM-FILTER
  1203 NOABUSE
  1188 NOPOSTMASTER
  1077 HELO-FILTER
  1070 REVDNS
  1010 DSBL
   952 SORBS
   919 EASYNET-PROXIES
   783 DSN
   726 MONKEYPROXIES
   689 BADHEADERS
   680 HEURISTICS
   680 HELOBOGUS
   651 WEIGHT16-35
   642 REVDNS-FILTER
   422 SPAMBAG
   416 BLITZEDALL
   397 SPAMDOMAINS
   391 LONGSUBJECT
   356 ROUTING
   306 OSPROXY
   306 FIVETEN-OPTIN
   300 COMMENTS
   294 IPWHOIS
   267 SUBJECTSPACES
   247 UCEB
   228 SM-ADULT-L1
   221 SM-ADULT-L2
   217 SM-ADULT-L3
   210 BASE64
   182 SM-ADULT-L4
   178 LEADMON
   149 SM-ADULT-L5
   140 MAILFROM
   114 BH-CHINA
    97 FABEL
    71 KOREA-NETS
    71 KITHRUP
    71 BH-KOREA
    68 BONDEDSENDER
    62 EASYNET-DYNA
    55 DSBL-MULTI
    54 SPAMHEADERS
    53 PIGS
    52 OSRELAY
    51 ORDB
    44 BH-JAPAN
    34 OSDIPS
    32 BH-ARGENTINA
    29 BH-RUSSIA
    27 BH-BRAZIL
    18 BH-TAIWAN
    18 BH-HONGKONG
    16 KUNDENSERVER
    14 BH-THAILAND
    10 DNSRBL-DUN
     8 EXSILIA-SPAM
     7 FIVETEN-MULTI
     4 NONENGLISH
     3 REMOTEIP-FILTER
     3 BH-MALAYSIA
     1 OSLIST
     1 BH-SINGAPORE

The following script will let you view the subject lines of all
messages flagged by whatever test you define in the script (in this
case I used "SORBS"), sorted by count:

egrep "Msg failed SORBS|Subject:" m:\imail\spool\spam\log\dec0617.log | grep -A 1 SORBS | grep Subject | cut -b 39- | sort -f | uniq -ic | sort -rfn

The output looks like (snip):

    10 Subject: You want a bigger one?
     9 Subject: Is your manhood too small?
     9 Subject: CheapTrips Airfares: Best Price Guaranteed
     8 Subject: prevent stretch marks during pregnancy
     8 Subject: Baby Boomers to GenX dhj k
     8 Subject: ##Low Income Funding Program vyig
     8 Subject: ##Low Income Funding Program h ymuviwtx uggldu
     7 Subject: View Photos Of Sexy Singles In Your Area
     7 Subject: SUCCESS... dizaa
     7 Subject: rsvp-feel better guaranteed
     7 Subject: Earn $500 a Week Easily !
     6 Subject: Increase your Penis by 2 to 5 full inches in Weeks.
     6 Subject: Earn $2000 Weekly Easily!
     5 Subject: good news - accelerates recovery from athletic injury
     5 Subject: Bargain Shoes
     5 Subject: >#Government Loan Program### ryb o q

Each of these scripts has to be run on a single line, with no carriage
returns, in order to work properly. Also, you will need to run them
from the directory that you extracted the UNIX utilities to. This is
because some of the files have the same name as Windows utilities,
"sort" for example.

Speaking of "sort", which is used in a couple of these scripts, there
appears to be about a 2 MB size limitation on the content you are
trying to sort.
It will only be an issue if your log files are around 25 MB or larger,
since the script is sorting the output of the first grep command, not
the whole log. I have sent an e-mail to the developer asking him about
this size limitation, since there appears to be no such limitation on
our Linux machines, where I can run the same script on a log file of
any size.

Have fun!

Bill

----- Original Message -----
From: "Markus Gufler" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Saturday, June 28, 2003 1:17 AM
Subject: RE: [Declude.JunkMail] time-dependently hold weight

>
> > I've considered this a few times, every time I prepare to
> > suggest it I remember what happened with my idea to test for
> > long subjects, there just isn't enough uniformity.
>
> Well. Maybe my idea is expressed from "the wrong side".
> Looking at the diagram I can also simply see that my current hold
> weight is a little bit too low.
> After adding some new SpamChk tests (we are currently testing) and some
> new RBL lists, the average value has increased a little bit. So the only
> thing I have to do is to slightly increase the hold weight (or decrease
> the points for every single test).
>
> The fact remains that only 13% of our FPs were received outside
> business hours. If there is some way to detect the sender's current
> local time or timezone, this will surely help again to reduce false
> positives or false negatives using a "time-dependently hold weight".
>
> > BTW, the graph is amazing, how is it made?
>
> Hmmm, it's not an "out of the box" tool, but maybe someone can develop
> it. I think it should be very easy but at the moment I'm not familiar
> with any RAD tool...
>
> So here are the steps I've done:
>
> 1.) grep all lines from the Declude logfile containing "Total weight =".
> Grep.exe is part of the UNIX tools, which you can find at
> http://unxutils.sourceforge.net/
> Don't be afraid to "install" these tools; you can simply extract the
> zip archive.
>
> C:\imail\spool\grep -U "Total weight =" dec0624.log > c:\imail\spool\tw0624.log
>
> This will create a new file tw0624.log in the spool folder containing
> only the lines with the total weight of each message processed by
> Declude JunkMail.
>
> Note: You need at least loglevel MID to see the "Total weight" lines in
> the logfile.
>
> 2.) Now I've "elaborated" my tw-file.
> In the following original line
> 06/21/2003 00:01:42 Q843b181400780c01 HELOBOGUS:19 . Total weight = 19
>
> a.) delete the date "06/21/2003 "
> 00:01:42 Q843b181400780c01 HELOBOGUS:19 . Total weight = 19
>
> b.) replace the " Q" after the time with ";"
> 00:01:42;843b181400780c01 HELOBOGUS:19 . Total weight = 19
>
> c.) replace the "Total weight = " with ";"
> 00:01:42;843b181400780c01 HELOBOGUS:19 . ;19
>
> 3.) Now you have a CSV file with the time in the first and the weight in
> the third column.
> You can import this, for example, into MS Excel.
>
> 4.) To "decode" the HH:MM:SS time format into something usable for a
> diagram I've used the following formula:
> C1 = (HOUR(A1)*3600)+(MINUTE(A1)*60)+SECOND(A1)
>
> This will give you the timecode in seconds in cell C1.
>
> 5.) Now you can play around with different diagrams, ...
> For example you can also sort all rows by the weight to create a graph
> like the ones attached to this message.
> This will show you whether you have done a good job configuring the
> tests, so that in the critical zone between 80 and 120% of your hold
> weight there are minimal messages (high slope).
>
> I know it looks like a lot of work, but it's done in a few minutes and
> will give you a great view of what's going on in your junkmail filter.
>
> All of these steps can be automated, if someone has the time and
> knowledge to create a small reporting tool...
>
> Markus
>

---
[This E-mail was scanned for viruses by Declude Virus (http://www.declude.com)]

---
This E-mail came from the Declude.JunkMail mailing list.
To unsubscribe, just send an E-mail to [EMAIL PROTECTED], and type "unsubscribe Declude.JunkMail". The archives can be found at http://www.mail-archive.com.
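
Markus's steps 2 through 4 in the quoted message could probably be
automated along the following lines. This is only a rough Python
sketch, not something taken from either message. It assumes every
"Total weight" line is laid out like the single sample line Markus
quotes (date, then time, with the weight as the last field), and the
names parse_total_weight and to_csv_rows are made up for illustration.
Unlike Markus's hand-edited file, it writes time;seconds;weight and
drops the spool ID and test names.

```python
def parse_total_weight(line):
    """Return (time, seconds_since_midnight, weight), or None.

    Assumed layout (from the one sample line in the quoted message):
    06/21/2003 00:01:42 Q843b181400780c01 HELOBOGUS:19 . Total weight = 19
    """
    if "Total weight =" not in line:
        return None
    fields = line.split()
    try:
        time_str = fields[1]          # second field is HH:MM:SS
        weight = int(fields[-1])      # last field is the weight (may be negative)
        hh, mm, ss = (int(part) for part in time_str.split(":"))
    except (IndexError, ValueError):
        return None                   # line does not match the assumed layout
    # Same arithmetic as the spreadsheet formula in step 4.
    return time_str, hh * 3600 + mm * 60 + ss, weight


def to_csv_rows(lines):
    """Yield 'time;seconds;weight' for every line that parses."""
    for line in lines:
        parsed = parse_total_weight(line)
        if parsed is not None:
            yield "%s;%d;%d" % parsed


# The sample line from the quoted message:
sample = "06/21/2003 00:01:42 Q843b181400780c01 HELOBOGUS:19 . Total weight = 19"
for row in to_csv_rows([sample]):
    print(row)  # prints 00:01:42;102;19
```

To run it over a real log you would pass an open file to to_csv_rows
and write the rows out to a .csv file for import into a spreadsheet.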