Okay, here is a small contribution to the list.  Markus, this script:

grep "Total weight =" m:\imail\spool\spam\log\dec0628.log | gawk "{print $2, $NF}" > log0628.txt

will output a file called log0628.txt in the following space-delimited
format (snip):

16:35:17 64
16:35:29 78
16:35:39 0
16:36:10 1
16:36:35 69
16:36:39 -13
16:36:50 90
16:36:51 37
16:36:55 74
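
For anyone who would rather script this than chain utilities, here is a
rough Python sketch of the same extraction. It also folds in the
time-to-seconds conversion that Markus did in Excel. The line layout is an
assumption taken from the sample line quoted further down in this thread,
and the function names are my own invention, so treat it as a starting
point rather than a finished tool:

```python
import re

# Assumed line layout, taken from the sample line quoted later in this thread:
#   06/21/2003 00:01:42 Q843b181400780c01 HELOBOGUS:19 .  Total weight = 19
WEIGHT_RE = re.compile(
    r"^\S+ (\d{2}):(\d{2}):(\d{2}) .*Total weight = (-?\d+)\s*$"
)


def parse_weight_line(line):
    """Return (seconds_since_midnight, weight), or None if the line does not match."""
    m = WEIGHT_RE.match(line)
    if m is None:
        return None
    hours, minutes, seconds, weight = (int(g) for g in m.groups())
    return hours * 3600 + minutes * 60 + seconds, weight


def read_weights(path):
    """Yield (seconds, weight) for every "Total weight" line in a log file."""
    with open(path, errors="replace") as log:
        for line in log:
            parsed = parse_weight_line(line)
            if parsed is not None:
                yield parsed


sample = "06/21/2003 00:01:42 Q843b181400780c01 HELOBOGUS:19 .  Total weight = 19"
print(parse_weight_line(sample))  # (102, 19)
```

Calling read_weights() on a log file gives you pairs that can be written
out as CSV or fed straight to a plotting tool.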

As Markus noted, the UNIX utilities needed to run these scripts can be
found at: http://unxutils.sourceforge.net/  There is no installation; simply
extract the files contained in the zip file into a directory and you're all
set.

Here are a couple of additional scripts to get you thinking about the power
of these utilities.  Hopefully people will share their own scripts with the
list as they develop them.  The following script will list all of your
Declude tests and show how many messages were flagged by each test:

egrep "Message OK|Msg failed" m:\imail\spool\spam\log\dec0615.log | gawk "{print $6}" | sort | uniq -c | sort -rn

This will output a report like the following in less than 30 seconds.  If
any of you have run some of the other JunkMail log reporting tools, which
can take hours to produce their reports, you will find this quite
extraordinary in comparison:

   9870 SPAMCHECK
   8827 NOLEGITCONTENT
   8082 IPNOTINMX
   7728 SM-SPAM-L1
   7466 SM-SPAM-L2
   7154 SPAMSNIFFER
   6793 WEIGHT36->
   6541 SM-SPAM-L3
   5749 REYNOLDS
   5698 HEADERS-FILTER
   5058 EASYNET-DNSBL
   4867 SM-SPAM-L4
   3932 SUBJECT-FILTER
   3762 BODY-FILTER
   3610 OSSRC
   2973 SPAMHAUS
   2902 OK
   2827 SPAMCOP
   2759 NJABL
   2605 OSSOFT
   2497 SM-SPAM-L5
   2480 INTERSIL
   1807 NOMOREFUNN
   1486 VOX
   1420 BLARSBL
   1300 FIVETEN-SRC
   1290 MAILFROM-FILTER
   1203 NOABUSE
   1188 NOPOSTMASTER
   1077 HELO-FILTER
   1070 REVDNS
   1010 DSBL
    952 SORBS
    919 EASYNET-PROXIES
    783 DSN
    726 MONKEYPROXIES
    689 BADHEADERS
    680 HEURISTICS
    680 HELOBOGUS
    651 WEIGHT16-35
    642 REVDNS-FILTER
    422 SPAMBAG
    416 BLITZEDALL
    397 SPAMDOMAINS
    391 LONGSUBJECT
    356 ROUTING
    306 OSPROXY
    306 FIVETEN-OPTIN
    300 COMMENTS
    294 IPWHOIS
    267 SUBJECTSPACES
    247 UCEB
    228 SM-ADULT-L1
    221 SM-ADULT-L2
    217 SM-ADULT-L3
    210 BASE64
    182 SM-ADULT-L4
    178 LEADMON
    149 SM-ADULT-L5
    140 MAILFROM
    114 BH-CHINA
     97 FABEL
     71 KOREA-NETS
     71 KITHRUP
     71 BH-KOREA
     68 BONDEDSENDER
     62 EASYNET-DYNA
     55 DSBL-MULTI
     54 SPAMHEADERS
     53 PIGS
     52 OSRELAY
     51 ORDB
     44 BH-JAPAN
     34 OSDIPS
     32 BH-ARGENTINA
     29 BH-RUSSIA
     27 BH-BRAZIL
     18 BH-TAIWAN
     18 BH-HONGKONG
     16 KUNDENSERVER
     14 BH-THAILAND
     10 DNSRBL-DUN
      8 EXSILIA-SPAM
      7 FIVETEN-MULTI
      4 NONENGLISH
      3 REMOTEIP-FILTER
      3 BH-MALAYSIA
      1 OSLIST
      1 BH-SINGAPORE
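
If you would like the same tally without the UNIX utilities, something
along these lines should work in Python.  This is only a sketch: the sample
log lines are made up for illustration, and the one thing I am relying on
is that the test name is the sixth whitespace-separated field, as in the
gawk command above.

```python
import re
from collections import Counter

# Same filter as the egrep pattern in the script above.
LINE_RE = re.compile(r"Message OK|Msg failed")


def count_tests(lines):
    """Count the sixth whitespace-separated field of each matching line (gawk's $6)."""
    counts = Counter()
    for line in lines:
        if LINE_RE.search(line):
            fields = line.split()
            # Unlike gawk, lines with fewer than six fields are skipped
            # instead of being counted as an empty name.
            if len(fields) >= 6:
                counts[fields[5]] += 1
    return counts


# Hypothetical log lines, made up for illustration only.
sample = [
    "06/15/2003 00:01:42 Q0001 Msg failed SORBS (details)",
    "06/15/2003 00:01:42 Q0001 Msg failed SPAMCOP (details)",
    "06/15/2003 00:01:50 Q0002 Msg failed SORBS (details)",
    "06/15/2003 00:01:51 Q0002 Subject: hello",
]

# For a real log, pass an open file instead of the list:
#   with open(path, errors="replace") as log: counts = count_tests(log)
for name, n in count_tests(sample).most_common():
    print(f"{n:7d} {name}")
```

Since only the per-test totals are kept in memory, the full list of
matching lines never has to be sorted.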

The following script will allow you to view the subject line of all messages
flagged by whatever test you define in the script (in this case I used
"SORBS"), and will sort them by count:

egrep "Msg failed SORBS|Subject:" m:\imail\spool\spam\log\dec0617.log | grep -A 1 SORBS | grep Subject | cut -b 39- | sort -f | uniq -ic | sort -rfn

The output looks like (snip):

     10 Subject: You want a bigger one?
      9 Subject: Is your manhood too small?
      9 Subject: CheapTrips Airfares: Best Price Guaranteed
      8 Subject: prevent stretch marks during pregnancy
      8 Subject: Baby Boomers to GenX dhj k
      8 Subject: ##Low Income Funding Program vyig
      8 Subject: ##Low Income Funding Program h ymuviwtx  uggldu
      7 Subject: View Photos Of Sexy Singles In Your Area
      7 Subject: SUCCESS... dizaa
      7 Subject: rsvp-feel better guaranteed
      7 Subject: Earn $500 a Week Easily !
      6 Subject: Increase your Penis by 2 to 5 full inches in Weeks.
      6 Subject: Earn $2000 Weekly Easily!
      5 Subject: good news - accelerates recovery from athletic injury
      5 Subject: Bargain Shoes
      5 Subject: >#Government Loan Program### ryb o q
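
The script above pairs each "Msg failed" line with whatever Subject line
happens to follow it, which can go wrong if lines from different messages
are interleaved in the log.  Here is a Python sketch of a variation that
pairs them by queue ID (the third field) instead.  I am assuming that both
kinds of line are laid out as "date time queue-ID text", which is what the
"cut -b 39-" offset suggests; the sample lines are made up.  It also
matches the test name as a whole word rather than as a substring, so it is
not an exact port of the script.

```python
from collections import Counter


def subjects_for_test(lines, test_name):
    """Return [(count, subject), ...] for messages that failed test_name.

    Subjects are grouped case-insensitively (like "uniq -i") and shown in
    the form in which they first appear in the log.
    """
    failed_ids = set()
    subject_by_id = {}
    for line in lines:
        fields = line.split(None, 3)  # date, time, queue ID, rest of line
        if len(fields) < 4:
            continue
        queue_id, rest = fields[2], fields[3].rstrip("\r\n")
        if rest.split()[:3] == ["Msg", "failed", test_name]:
            failed_ids.add(queue_id)
        elif rest.startswith("Subject:"):
            subject_by_id[queue_id] = rest

    counts = Counter()
    display = {}
    for queue_id, subject in subject_by_id.items():
        if queue_id in failed_ids:
            key = subject.lower()
            counts[key] += 1
            display.setdefault(key, subject)
    return [(n, display[key]) for key, n in counts.most_common()]


# Hypothetical log lines, made up for illustration only.
sample = [
    "06/17/2003 00:01:42 Q0001 Msg failed SORBS (details)",
    "06/17/2003 00:01:42 Q0002 Msg failed SPAMCOP (details)",
    "06/17/2003 00:01:43 Q0002 Subject: Hello there",
    "06/17/2003 00:01:44 Q0001 Subject: Bargain Shoes",
    "06/17/2003 00:02:10 Q0003 Msg failed SORBS (details)",
    "06/17/2003 00:02:11 Q0003 Subject: bargain shoes",
]
for n, subject in subjects_for_test(sample, "SORBS"):
    print(f"{n:7d} {subject}")
```

In the made-up sample, the Subject line that directly follows the first
SORBS failure belongs to a different message, and pairing by queue ID
leaves it out of the count.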

Each of these scripts has to be run on a single line, with no carriage
returns, in order to work properly (rejoin any lines that your mail reader
has wrapped).  Also, you will need to run these scripts from the directory
that you have extracted the UNIX utilities to.  This is because some of the
files have the same name as Windows utilities ("sort", for example), and
you want the UNIX version to be the one that runs.

Speaking of "sort", which is used in a couple of these scripts, there
appears to be about a 2 MB size limitation on the content you are trying to
sort.  It will only be an issue if your log files are around 25 MB or
larger, since the script is sorting the output of the first grep command
rather than the whole log.  I have sent an e-mail to the developer asking
him about this size limitation, since there appears to be no such
limitation on our Linux machines, where I can run the same script on any
size log file.

Have fun!

Bill

----- Original Message ----- 
From: "Markus Gufler" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Saturday, June 28, 2003 1:17 AM
Subject: RE: [Declude.JunkMail] time-dependently hold weight


>
>
> > I've considered this a few times, every time I prepare to
> > suggest it I remember what happened with my idea to test for
> > long subjects, there just isn't enough uniformity.
>
> Well. Maybe my idea is expressed from "the wrong side".
> Watching the diagram I can also simply fathom that my current hold
> weight is a little bit too low.
> After adding some new SpamChk tests (we are currently testing) and some
> new RBL-lists, the average value has increased a little bit. So the only
> thing I have to do is to increase slightly the hold weight (or decrease
> the points for every single test)
>
> The fact remains that only 13% of our FPs were received outside
> business hours. If there is some way to detect the sender's current
> local time or time zone, this will surely help to further reduce false
> positives or false negatives using a "time-dependently hold weight"
>
>
> > BTW, the graph is amazing, how is it made?
>
> Hmmm, it's not an "out of the box" tool, but maybe someone can develop
> it. I think it should be very easy but at the moment I'm not familiar
> with any RAD tool...
>
> So here are the steps I followed:
>
> 1.) grep all lines from the declude logfile containing "Total weight ="
> Grep.exe is part of the UNIX utilities, which you can find at
> http://unxutils.sourceforge.net/
> Don't be afraid to "install" these tools. You can simply extract the
> zip-archive.
>
> C:\imail\spool\grep -U "Total weight =" dec0624.log > c:\imail\spool\tw0624.log
>
> This will create a new file tw0624.log in the spool folder containing
> only the lines with the total weight of any message processed by declude
> junkmail.
>
> Note: You need at least loglevel MID to see the "Total weight" lines in
> the logfile.
>
> 2.) Now I've "elaborated" my tw-file
> In the following original line
> 06/21/2003 00:01:42 Q843b181400780c01 HELOBOGUS:19 .  Total weight = 19
>
> a.) delete the date "06/21/2003 "
> 00:01:42 Q843b181400780c01 HELOBOGUS:19 .  Total weight = 19
>
> b.) replace the " Q" after the time with ";"
> 00:01:42;843b181400780c01 HELOBOGUS:19 .  Total weight = 19
>
> c.) replace the "Total weight = " with ";"
> 00:01:42;843b181400780c01 HELOBOGUS:19 .  ;19
>
> 3.) Now you have a CSV file with the time in the first and the weight in
> the third column.
> You can import this for example into MS Excel
>
> 4.) To "decode" the HH:MM:SS time format into something usable for a
> diagram I've used the following formula:
> D1 = (HOUR(A1)*3600)+(MINUTE(A1)*60)+SECOND(A1)
>
> This will give you in cell D1 the timecode in seconds (column C already
> holds the weight)
>
> 5.) Now you can play around with different diagrams, ...
> For example you can also sort all rows by the weight to create a graph
> like the ones attached to this message.
> This will show you if you have done a good job configuring the tests so
> that in the critical zone between 80 and 120% of your hold weight there
> are as few messages as possible. (steep slope)
>
> I know it looks like a lot of work, but it's done in a few minutes and
> will give you a great view of what's going on in your junkmail filter.
>
> All of these steps can be automated, if someone has the time and
> knowledge to create a small reporting tool...
>
> Markus
>
>
>
>

---
[This E-mail was scanned for viruses by Declude Virus (http://www.declude.com)]

---
This E-mail came from the Declude.JunkMail mailing list.  To
unsubscribe, just send an E-mail to [EMAIL PROTECTED], and
type "unsubscribe Declude.JunkMail".  The archives can be found
at http://www.mail-archive.com.
