On Sun, Nov 15, 2009 at 08:53, rich...@buzzhost.co.uk
<rich...@buzzhost.co.uk> wrote:
> On Sun, 2009-11-15 at 03:14 -0500, Warren Togami wrote:
>> http://mail-archives.apache.org/mod_mbox/spamassassin-users/200910.mbox/%3c4ad11c44.9030...@redhat.com%3e
>> Compare this report to a similar report last month.
>>
>> http://wiki.apache.org/spamassassin/NightlyMassCheck
>> The results below are only as good as the data submitted by nightly
>> masscheck volunteers.  Please join us in nightly masschecks to increase
>> the sample size of the corpora so we can have greater confidence in
>> the nightly statistics.
>>
>> http://ruleqa.spamassassin.org/20091114-r836144-n
>> Spam 131399 messages from 18 users
>> Ham  189948 messages from 18 users
>>
>> ============================
>> DNSBL lastexternal by Safety
>> ============================
>> SPAM%    HAM%    RANK RULE
>> 12.8342% 0.0021% 0.94 RCVD_IN_PSBL *
>> 12.3053% 0.0026% 0.94 RCVD_IN_XBL
>> 31.2499% 0.0827% 0.87 RCVD_IN_ANBREP_BL *2
>> 80.2578% 0.1485% 0.86 RCVD_IN_PBL
>> 27.1836% 0.1985% 0.79 RCVD_IN_SORBS_DUL
>> 19.8213% 0.1785% 0.79 RCVD_IN_SEMBLACK *
>> 90.9360% 0.3854% 0.77 RCVD_IN_BRBL_LASTEXT
>> 13.0564% 0.4838% 0.67 RCVD_IN_HOSTKARMA_BL *
>>
>> Commentary:
>> * PSBL and XBL lead in apparent safety.
>> * ANBREP was added after the October report and has made a surprisingly
>> strong showing in this past month.  ANBREP is currently unavailable to
>> the general public.  The list owner is thinking about going public with
>> the list, which I would encourage because they are clearly doing
>> something right.  It seems he would need a global network of automated
>> mirrors to be able to scale.  He would also need listing/delisting
>> policy clearly stated on a web page somewhere.
>> * SEMBLACK consistently has been performing adequately in safety while
>> catching a respectable amount of spam.  I personally use this
>> non-default blacklist.
>> * It is clear that the two main blacklists are Spamhaus and BRBL.  The
>> Zen combination of Spamhaus zones is extremely effective and generally
>> safe.  BRBL has a high hit rate as well, with a moderate safety rating.
>> * HOSTKARMA_BL has ranked dead last in safety for the past several weeks in a
>> row, while not being more effective against spam than PSBL, XBL or SEMBLACK.
>>
>> ===============================
>> HOSTKARMA_BL much better as URIBL
>> ===============================
>> SPAM%    HAM%    RANK RULE
>> 68.3651% 0.2806% 0.79 URIBL_HOSTKARMA_BL *
>>
>> Commentary:
>> While HOSTKARMA_BL is pretty unsafe as a plain DNSBL, it is surprisingly
>> effective as a URIBL.  This is curious as it seems it was not designed
>> to be used as a URIBL.  In any case, as long as our masschecks show good
>> statistics like this, I will personally use this on my own spamassassin
>> server.
>>
>> =========================
>> SPAMCOP Dangerous?
>> =========================
>> SPAM%    HAM%    RANK RULE
>> 17.4225% 2.6076% 0.56 RCVD_IN_BL_SPAMCOP_NET *
>>
>> Commentary:
>> Is Spamcop seriously this bad?  It has consistently shown a high false
>> positive rate in these past weeks.  Was it safer than this in the past
>> to warrant the current high score in spamassassin-3.2.5?
>>
>> Warren Togami
>> wtog...@redhat.com
>
> Is it not a bit flawed to do the metrics on volunteer submissions, given
> that Spamhaus is said to have a small army of them? It means the data
> cannot be relied upon as any kind of sensible comparison.

Please explain.  How would you suggest measuring false positives?
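
For context, the SPAM% and HAM% columns quoted above read as per-rule hit
rates over the hand-classified spam and ham corpora, so HAM% is the
false-positive figure under discussion. A minimal sketch of that arithmetic,
assuming this reading is right: the corpus totals and percentages come from
the quoted report, while hit_rate and estimated_hits are hypothetical helper
names, not part of the masscheck tooling.

```python
# Corpus sizes taken from the quoted report (20091114-r836144-n).
SPAM_TOTAL = 131399
HAM_TOTAL = 189948


def hit_rate(hits, total):
    """Return the percentage of messages in a corpus that a rule fired on."""
    if total <= 0:
        raise ValueError("corpus must not be empty")
    return 100.0 * hits / total


def estimated_hits(percent, total):
    """Convert a reported percentage back to an approximate message count."""
    return round(percent / 100.0 * total)


# Approximate number of ham messages each rule fired on (false positives),
# derived from the HAM% column of the report.
spamcop_ham_hits = estimated_hits(2.6076, HAM_TOTAL)  # RCVD_IN_BL_SPAMCOP_NET
psbl_ham_hits = estimated_hits(0.0021, HAM_TOTAL)     # RCVD_IN_PSBL

print("SPAMCOP ham hits (approx):", spamcop_ham_hits)
print("PSBL ham hits (approx):", psbl_ham_hits)
```

Seen as counts rather than percentages, the gap between the lists is easier
to judge, and it also shows how few ham hits separate the top-ranked rules.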

-- 
--j.
