On Fri, 23 Feb 2018, Amir Caspi wrote:

Hi all,

        So, I've been trying to tweak my setup and noticed that VERY few of my 
emails are being autolearned as spam, even when their spam score is far above 
the autolearn threshold.  The threshold is set to 12; I just saw a spam with score 
>25 not being autolearned.

        Are there rules that prevent autolearning?  If so, why?  If a spam 
scores really high because it hits (let's say) 10 or more rules, but just one 
of those rules is enough to prevent autolearning, that seems overly 
restrictive, no?

        For example, for one of my users, out of about 650 spams received in 
the last month, only 10 have been autolearned.  For another user, only 12 of 
nearly 1400.  That seems like a very low percentage, and clearly some 
high-scoring spams are not being auto-learned.

Any explanation is appreciated!


--- Amir

If you read the SpamAssassin documentation about Bayes auto-learning, you will see that several conditions must be satisfied before a message is auto-learned.

For example, some types of rules aren't considered at all when computing the score that is compared against the auto-learning threshold: white/blacklist rules, rules tagged with the noautolearn tflag, and the Bayes rules themselves.

Of the points from rules that do count, at least 3 must come from header-type rules and at least 3 must come from body-type rules before a message is auto-learned as spam.

So a spam can have 100 points from a blacklist and not auto-learn.

It could have 20 points from a whole bunch of body rules but if it only hit 2
points via header rules it still will not auto-learn.
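
To make those conditions concrete, here is a rough Python sketch of the decision as I described it. This is not SpamAssassin's actual code: the Hit class, the field names and the function name are made up for illustration, the threshold of 12 is the one from your setup, and it glosses over details such as which score set is used.

```python
from dataclasses import dataclass

# Hypothetical model of one rule hit; not a SpamAssassin data structure.
@dataclass
class Hit:
    name: str
    score: float
    kind: str               # "header" or "body"
    eligible: bool = True   # False for white/blacklist, noautolearn and Bayes rules

def would_autolearn_spam(hits, threshold=12.0, min_header=3.0, min_body=3.0):
    """Simplified check: would this set of hits be auto-learned as spam?"""
    # Rules that are excluded from auto-learning contribute nothing.
    counted = [h for h in hits if h.eligible]
    total = sum(h.score for h in counted)
    header = sum(h.score for h in counted if h.kind == "header")
    body = sum(h.score for h in counted if h.kind == "body")
    # All three conditions must hold at the same time.
    return total >= threshold and header >= min_header and body >= min_body

# The two cases mentioned above, plus one that passes.
blacklist_only = [Hit("BLACKLIST_HIT", 100.0, "header", eligible=False)]
body_heavy = [Hit("BODY_RULES", 20.0, "body"), Hit("HEADER_RULE", 2.0, "header")]
balanced = [Hit("RBL_HITS", 8.0, "header"), Hit("BODY_RULES", 6.0, "body")]
```

With these inputs only the "balanced" message passes; the other two fail for the reasons given above, no matter how high the total score is.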

Another possible factor: if you have "bayes_auto_learn_on_error" enabled, autolearn will be skipped when Bayes already agrees with the classification of the message. I.e., if the message is already classified as BAYES_99, it won't bother auto-learning it as yet another high-ranking spam.
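
Expressed as a small Python sketch (again only an illustration of the behaviour described here, with a made-up function name and parameters, not the real implementation):

```python
def skipped_by_learn_on_error(on_error_enabled, tests_hit, learning_as_spam):
    """Return True if auto-learning would be skipped because Bayes already agrees."""
    if not on_error_enabled:
        # With the option off, agreement with Bayes never blocks auto-learning.
        return False
    if learning_as_spam:
        # Bayes is already sure this is spam, so there is nothing to correct.
        return "BAYES_99" in tests_hit
    # Same idea for ham: Bayes is already sure this is not spam.
    return "BAYES_00" in tests_hit
```

So a spam that hits BAYES_99 is skipped when the option is on, while one that hits BAYES_50 would still be a candidate for learning.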

What I usually see in auto-learned spam is that they hit a number of network RBL rules (spamhaus, SORBS, etc) and a number of body rules such as RAZOR, URIBLS, etc.

Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
