Thank you, Markus, for the very informative analysis.  I see that we're
not using some of the more accurate tests (because our global.cfg file
is a little out of date).  A number of these tests are not defined in
Declude's example global.cfg file.  Can you supply a global.cfg (or part
of one) with an example test definition for each of these tests?

Thanks,

Todd Holt
Xidix Technologies, Inc
Las Vegas, NV USA
702.319.4349
www.xidix.com
 

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Markus Gufler
Sent: Monday, April 05, 2004 12:43 PM
To: [EMAIL PROTECTED]
Subject: RE: [IMail Forum] March 2004 Spam Statistics

[This email took a suspicious route to arrive here; Suspected SPAM (4)]

 
Besides Scott's monthly stats, which show which tests catch the most
spam, I wondered what each individual test contributes to catching as
much spam as possible with as few false positives as possible (on an MTA
processing legit messages for over 1000 mailboxes).

The calculation is based on the assumption that the weighting system on
our server catches over 97% of spam with around 0.01% false positives.
(We review all held spam between 100% and 200% of our hold weight and
keep a note of every requeued legit message.)

So when parsing the log files I assume that the final weight is
"correct", which tells me whether a message is spam or legit. Then I look
at the individual result of each test and check whether it counted in
the "right" direction.

For example
Final weight: 120 points   => it's spam
BASE64: 10 points          => right result
SPAMCOP: 10 points         => right result 
NOLEGITCONTENT: -5 points  => wrong result
...

The result is a table with 4 values for each single test:

Dark green:  right result for spam message
Light green: right result for legit message
Dark red:    wrong result for spam message
Light red:   wrong result for legit message
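
The counting scheme above can be sketched in Python as follows. This is
only an illustration, not the script used to produce the statistics: the
hold weight of 100 points, the function name classify_vote and the
category labels are assumptions made for the example.

```python
# Hypothetical threshold: a final weight at or above this counts as spam.
HOLD_WEIGHT = 100


def classify_vote(final_weight, test_weight, hold_weight=HOLD_WEIGHT):
    """Return the counter a single test result belongs to.

    The final weight is assumed to be "correct", so it alone decides
    whether the message is treated as spam or legit.
    """
    if test_weight == 0:
        return "no_result"           # test did not fire (grey)
    is_spam = final_weight >= hold_weight
    voted_spam = test_weight > 0     # positive weight = vote for spam
    if is_spam:
        # dark green / dark red
        return "right_spam" if voted_spam else "wrong_spam"
    # light red / light green
    return "wrong_legit" if voted_spam else "right_legit"


# The example message from above: final weight 120 points.
votes = {"BASE64": 10, "SPAMCOP": 10, "NOLEGITCONTENT": -5}
for name, weight in votes.items():
    print(name, classify_vote(120, weight))
```

For the example message, BASE64 and SPAMCOP land in the "right result for
spam" counter and NOLEGITCONTENT in "wrong result for spam".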

Besides the absolute numbers I've also created a diagram with relative
values, which also shows for how many messages the test didn't return
any result (grey).
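
As a rough sketch of how such relative values could be derived from the
absolute counts (the function name relative_shares, the counter names and
the numbers are made up for illustration and are not taken from the real
statistics):

```python
def relative_shares(counts, total_messages):
    """Convert absolute counts for one test into percentages.

    Messages for which the test returned nothing make up the
    "no_result" (grey) share.
    """
    if total_messages <= 0:
        raise ValueError("total_messages must be positive")
    answered = sum(counts.values())
    shares = {key: 100.0 * value / total_messages
              for key, value in counts.items()}
    shares["no_result"] = 100.0 * (total_messages - answered) / total_messages
    return shares


# Made-up counts for a single test over 200 messages.
example = {"right_spam": 50, "right_legit": 0,
           "wrong_spam": 0, "wrong_legit": 1}
print(relative_shares(example, 200))
```

With these made-up numbers the single false positive is only 0.5% of all
messages, so it would stay below a 1% bar in a diagram while still
appearing in the absolute numbers.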

You can find the results on www.zcom.it/decludeupdater/spam_stats.htm

Notes:
1.) Most tests can by design return only positive or only negative
results, but there are also tests that can return both positive (voting
for spam) and negative (voting for legit) results. For example, an IP4R
test usually has (or should have) a positive result for spam and no
result for legit messages, so it can't vote right for a legit message or
wrong for a spam message.

2.) At first glance the table may be a little confusing. Hovering the
mouse over the relative bars shows a short explanation.

3.) Briefly: the more green you see, the better. Red is bad.

4.) If you can't see any red bar in the relative values, this means that
there are not enough false positives to reach at least 1% in the
diagram. You may still see a few false positives in the absolute
numbers. Not many tests are as free of false positives as John
Tolmachoff's AUTOWHITE. (Its only FP was caused by a spam-test message
containing a lot of typical spam keywords.)

5.) Because I assume that the final weight is right, it can happen that
one or more tests vote "right" while the final weight is wrong (spam
getting through the filters, or a legit message held as a false
positive). In this case the tests with the right vote are counted under
the red values. But since I know that we already have a well-balanced
weighting system, these wrong counts should be very rare.

Any comments or suggestions are welcome!
Hope this helps and you can understand my "english" :-)

Markus



To Unsubscribe: http://www.ipswitch.com/support/mailing-lists.html
List Archive:
http://www.mail-archive.com/imail_forum%40list.ipswitch.com/
Knowledge Base/FAQ: http://www.ipswitch.com/support/IMail/

---
[This E-mail scanned for viruses by Declude Virus
(http://www.declude.com)]

