pm...@email.it wrote:
Hi,
I have a few questions about the behavior of Bayes with SQL. Before the
questions: I followed this tutorial
http://www200.pair.com/mecham/spam/debian-spamassassin-sql.html
which should be equivalent to this:
http://spamassassin.apache.org/full/3.0.x/dist/sql/README.bayes
My DB is updated constantly, so it should be working.
1- In the bayes_vars table I have only one row, for the amavis user. In
theory, is it a good choice to use a single DB for all the users of my
domain? (If I've understood correctly, SpamAssassin uses this single DB
to store Bayes data for all users of my domain.)
In theory, per-user is slightly more accurate than systemwide. However,
training is more important than granularity. So when it comes down to
it, unless you're ready to set up something where users can individually
report spam and nonspam (can be a bit tricky) you're probably better off
going with a single system-wide bayes database. At least this way if you
need to do some manual training, it's only one DB to train on and
everyone benefits.
2- How can I use a separate Bayes DB for each user? Should I use
bayes_sql_override_username? I don't know where to get the right
username.
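You'd need to get amavis to pass the recipient's username to spamassassin.
I don't know enough about amavis to know if this is supported or not.
Generally, most MTA-layer integrations don't, and most MDA integrations do,
but there are lots of exceptions. Amavis is an MTA integration, but it
might be one of the exceptions. Note that bayes_sql_override_username goes
the other way: it forces one username for everything, which is what you'd
use for a global database.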
3- Every 10-15 seconds, the ham_count or spam_count value in the
bayes_vars table increases even though no user is sending or receiving
mail. So, does SpamAssassin analyze all the mail present in all my
users' Maildirs?
No. SpamAssassin has no concept that your users' maildirs even exist; it
will not scan them.
There are only 2 ways training occurs:
1) a message passes through SA during delivery, and gets auto-learned
due to the scoring criteria
2) someone (or some cronjob) calls sa-learn and explicitly feeds it mail.
The only other way the counts could update is during a journal sync,
which occurs only during message processing or calls to sa-learn (the
exact triggers are slightly different, but from a high-level view they're
more or less the same).
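
To make path 1 concrete, here is a rough Python sketch of the auto-learn decision. It is a simplification, not SpamAssassin's actual code: the two constants are assumed to match the documented defaults for bayes_auto_learn_threshold_nonspam and bayes_auto_learn_threshold_spam, the function name is made up for illustration, and the extra conditions the real plugin applies before learning a message are left out.

```python
# Simplified sketch of the auto-learn decision; not SpamAssassin's own code.
from typing import Optional

# Assumed to match the documented defaults of
# bayes_auto_learn_threshold_nonspam and bayes_auto_learn_threshold_spam.
# Check your own local.cf, since either may have been overridden.
HAM_THRESHOLD = 0.1
SPAM_THRESHOLD = 12.0


def auto_learn_decision(score: float) -> Optional[str]:
    """Return "ham", "spam", or None when the message is not learned.

    `score` stands for the message score used for auto-learning.
    """
    if score < HAM_THRESHOLD:
        return "ham"    # very low score: learned as nonspam
    if score >= SPAM_THRESHOLD:
        return "spam"   # very high score: learned as spam
    return None         # anything in between is left alone
```

Under these assumptions, each message that scores very low or very high during delivery bumps ham_count or spam_count by one, and anything in between leaves the counts unchanged.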
It seems strange you're seeing the counts increase without any incoming
mail... Are you *positive* nothing is arriving, or recently arrived and
is just finishing up being processed by SA?
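
One way to check is to log each change in bayes_vars with a timestamp and compare those times against your mail log. The sketch below assumes the username, spam_count and ham_count columns from the stock bayes_vars schema and accepts any Python DB-API connection; the function names, the polling interval and the choice of driver are illustrative assumptions, not part of SpamAssassin.

```python
import time
from datetime import datetime


def read_counts(conn):
    """Return {username: (spam_count, ham_count)} from bayes_vars."""
    cur = conn.cursor()
    cur.execute("SELECT username, spam_count, ham_count FROM bayes_vars")
    return {user: (spam, ham) for user, spam, ham in cur.fetchall()}


def diff_counts(before, after):
    """Return {username: (spam_delta, ham_delta)} for rows that changed."""
    changes = {}
    for user, (spam, ham) in after.items():
        old_spam, old_ham = before.get(user, (0, 0))
        if (spam, ham) != (old_spam, old_ham):
            changes[user] = (spam - old_spam, ham - old_ham)
    return changes


def watch(conn, interval=5.0, rounds=12):
    """Poll bayes_vars and print a timestamped line for every change."""
    previous = read_counts(conn)
    for _ in range(rounds):
        time.sleep(interval)
        current = read_counts(conn)
        stamp = datetime.now().isoformat(timespec="seconds")
        for user, (d_spam, d_ham) in diff_counts(previous, current).items():
            print(f"{stamp} {user}: spam {d_spam:+d}, ham {d_ham:+d}")
        previous = current
```

You would call watch(conn) with a connection from whichever MySQL driver you have installed. If every printed timestamp lines up with a delivery in the mail log, the increases are just auto-learning on incoming mail.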
Thanks :)
Marco