Reindl Harald [mailto:h.rei...@thelounge.net] wrote:
>> This is a mail gateway for multiple companies. I'm not supposed to read
>> e-mails on it, or to pick out mails that can be used for learning ham
> how did you then manage 1.4 million ham samples in your biased corpus?
It looks like this amavisd/SpamAssassin combo automatically learned a lot of
messages as ham (which weren't ham):
Feb 11 03:37:31 amavis: (20024-06) spam-tag, <no-re...@maiutazas.hu> ->
<someb...@company.hu>, No, score=-0.099 tagged_above=-9999 required=4
tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1,
I never configured autolearning; I assume it came with this CentOS setup. The
SpamAssassin man page says bayes_auto_learn has a default value of 1.
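To double-check what the running setup actually uses, something like the following could be pointed at the SpamAssassin config. This is only a rough sketch: the function name and the sample config text are made up for illustration, and it assumes the simple "bayes_auto_learn 0|1" line syntax with the last occurrence winning. It falls back to the default of 1 mentioned above when no such line exists.

```python
def effective_bayes_auto_learn(config_text: str, default: int = 1) -> int:
    """Return the last bayes_auto_learn value set in config_text, else default."""
    value = default
    for raw_line in config_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        parts = line.split()
        if len(parts) == 2 and parts[0] == "bayes_auto_learn" and parts[1] in ("0", "1"):
            value = int(parts[1])  # assumption: a later line overrides an earlier one
    return value


# Hypothetical config contents, not taken from the real gateway.
sample = """\
# local settings
required_score 4
bayes_auto_learn 0
"""

print(effective_bayes_auto_learn(sample))               # explicit setting found
print(effective_bayes_auto_learn("required_score 4\n"))  # nothing set, default applies
```

On a real box the text would come from the site config file (typically local.cf, but the exact path on this gateway is an assumption).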
>> Without autolearning and without the help of the end-users, I can't build a
>> proper ham bayes database, can I?
> surely, or don't you and the people around you who can help send and
> receive mails?
I don't want to get into this "fight", but end-users have limited IT knowledge.
They are 100% Outlook users (forwarding inline and as an attachment always confuses them).
If I really want this, I need a user-proof, one-click solution like
Gmail's "spam" and "not spam" buttons, which magically save e-mails to the
proper technical mailbox (which is reviewed by the admins and then used to train SA).
With Outlook users and Exchange internal MTAs, my options are limited.
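If users could at least be taught to forward as attachment to such a technical mailbox, the admin side could unwrap the reports before review. Below is a minimal sketch of that unwrapping with Python's standard email package. It assumes the report arrives with the original mail as a message/rfc822 part, which I have not verified for Outlook/Exchange; the function name, addresses and sample messages are invented for the example.

```python
from email import message_from_string
from email.message import Message
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def extract_attached_messages(wrapper: Message) -> list:
    """Return every message/rfc822 attachment found inside wrapper."""
    found = []
    for part in wrapper.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-element list
            # holding the original message.
            found.append(part.get_payload(0))
    return found


# Build a made-up "forward as attachment" report to try it out.
original = MIMEText("Buy cheap pills now")
original["Subject"] = "Special offer"
original["From"] = "sender@example.com"

report = MIMEMultipart()
report["Subject"] = "FW: Special offer"
report.attach(MIMEText("This is spam, please train on it."))
report.attach(MIMEMessage(original))

# Round-trip through text, as if the report had been read from a mailbox.
parsed = message_from_string(report.as_string())
for msg in extract_attached_messages(parsed):
    print(msg["Subject"])
```

Inline forwards are not handled here: the original headers are gone in that case, so those reports would be of little use for training anyway.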
So, if I understood correctly, you all agree that the Bayesian database is
f***** up: let's start with a new one, with autolearn turned off, and train SA from
scratch with both ham and spam mails.
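For my own notes, the retraining could be scripted roughly like this. It is a sketch, not something I have run on the gateway: it wraps the sa-learn options --clear, --ham, --spam and --dump magic, the directory paths are placeholders, and it defaults to a dry run that only prints the commands. I assume it has to run as whatever user owns the Bayes database (probably the amavis user here), and that bayes_auto_learn 0 is set in the config first.

```python
import subprocess


def retrain_commands(ham_dir: str, spam_dir: str) -> list:
    """Build the sa-learn calls for a from-scratch retrain, in order."""
    return [
        ["sa-learn", "--clear"],           # wipe the existing Bayes database
        ["sa-learn", "--ham", ham_dir],    # learn hand-picked ham
        ["sa-learn", "--spam", spam_dir],  # learn hand-picked spam
        ["sa-learn", "--dump", "magic"],   # show the resulting counters
    ]


def retrain(ham_dir: str, spam_dir: str, dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in retrain_commands(ham_dir, spam_dir):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure


# Placeholder directories, one message per file; dry run only.
retrain("/path/to/ham", "/path/to/spam")
```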