On Wed, 11 Sep 2013 18:25:59 +0200
Mathieu R. wrote:

> Hello,
> 
> Sorry for posting to both the spamassassin and dovecot lists: my
> question is about the dovecot antispam plugin, used to train
> spamassassin with sa-learn.
> 
> I wonder if there is a way to confirm that sa-learn is correctly fed
> by the antispam plugin.
> ...
> and here is what i got in /tmp/sa-learn-pipe.log:
> 
> 10545-start (--spam)
> 10545-end
> 
> For me, it's working, but when I run sa-learn --backup, I just get
> this:
> 
> v       3       db_version # this must be the first line!!!
> v       0       num_spam
> v       0       num_nonspam
> 
> it's probably because I'm using ***STANDARD-ANTI-UBE-TEST-EMAIL***,
> which probably teaches nothing to sa-learn,

It should still have been learned. Usually this kind of thing is due
to different invocations looking for the Bayes database in
different places.

If I were you, I'd modify the script to run sa-learn with -D bayes and
have it dump stderr to a file. If you are attempting to use per-Unix-user
databases, it might be useful to log $HOME as well.
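
Something along these lines might do as a starting point. It is only a
sketch: the function name, the SA_LEARN/SA_LOG/SA_DEBUG_LOG variables and
the debug log path are my own placeholders, not anything the antispam
plugin defines, and I'm assuming the script receives the message on stdin
and the --spam/--ham switch as its argument, as your log suggests.

```shell
#!/bin/sh
# Sketch of a debugging wrapper for the antispam pipe script.
# SA_LEARN, SA_LOG and SA_DEBUG_LOG are hypothetical knobs for this
# example, not settings of the plugin or of SpamAssassin.

learn_with_debug() {
    log="${SA_LOG:-/tmp/sa-learn-pipe.log}"
    debug_log="${SA_DEBUG_LOG:-/tmp/sa-learn-debug.log}"   # assumed path

    {
        echo "$$-start ($*)"
        # Record who we run as and where ~ points, since the Bayes
        # database location normally depends on the user's home.
        echo "$$-user=$(id -un) HOME=${HOME:-unset}"
    } >> "$log"

    # The message arrives on stdin, which sa-learn inherits.
    # -D bayes turns on Bayes debugging; the output goes to stderr.
    "${SA_LEARN:-sa-learn}" -D bayes "$@" 2>> "$debug_log"
    rc=$?

    echo "$$-end (exit $rc)" >> "$log"
    return "$rc"
}
```

The last line of the pipe script would then be `learn_with_debug "$@"`.
Comparing the database path in the debug log with the one your manual
sa-learn --backup run uses should show whether they differ.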


I'm sceptical that the Antispam plugin can learn enough ham this way.
As I understand it, the only mail that gets learnt as ham will be
false positives based on the overall spamassassin score, irrespective of
the Bayes result. Bayes needs (by default) 200 spams and 200 hams to even
start classifying, and many more for optimal results. I don't expect to
get 200 FPs in the rest of my life. Unless this is a high-volume server
with a shared database, I'd suggest either learning a few thousand hams
manually, or implementing an unsure folder. You can also mitigate the
problem by autotraining with a high ham threshold, but then you
really need to be careful to move all spam to the spam folder.
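
To keep an eye on how far the database is from that 200/200 minimum, a
small filter over the sa-learn --backup output you quoted could be used.
This is a rough sketch: bayes_counts is a name I made up, and it assumes
the header lines keep the "v <count> num_spam" / "v <count> num_nonspam"
layout shown in your mail.

```shell
#!/bin/sh
# Hypothetical helper: reads `sa-learn --backup` output on stdin, prints
# the learned counts, and returns 0 only if both reach the minimum
# (first argument, default 200).

bayes_counts() {
    awk -v min="${1:-200}" '
        $1 == "v" && $3 == "num_spam"    { spam = $2 }
        $1 == "v" && $3 == "num_nonspam" { ham  = $2 }
        END {
            printf "nspam=%d nham=%d\n", spam, ham
            # Exit status 0 means both counts meet the minimum.
            exit !((spam + 0 >= min + 0) && (ham + 0 >= min + 0))
        }
    '
}
```

Usage would be something like `sa-learn --backup | bayes_counts`, run as
the same user (and with the same $HOME) as the pipe script. On a large
database the backup dump can be long, since it also lists every token.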


 
