Steve wrote:
On Monday 30 May 2005 19:25, mouss wrote:
run SA from amavisd, and run sa-learn with the same uid as amavisd.
Okay, ignore my previous message. I'm working on getting amavisd to run SA.
Currently, amavisd seems to be running as user 'vscan' (UID 65). How do I run
sa-learn as this user and where would it put the bayesian DB?
as root, run
su vscan -c "sa-learn ...."
make sure the message file or dir is read by root, not by vscan, since
in a correct setup vscan doesn't have read permission there. so use
something like this (typed off the top of my head, check before use):
# root opens each message; sa-learn (as $amavisuser) only reads stdin
find "$spamfolder" -type f | while IFS= read -r f; do
    su "$amavisuser" -c "sa-learn --spam ..." < "$f"
    mv "$f" "$killfolder"
done
this assumes a maildir setup. mbox requires more work...
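
One thing the loop above does not do is check whether sa-learn succeeded
before moving the message. A minimal sketch of that check follows. The
names LEARN_CMD and learn_dir are made up for illustration, and it
assumes sa-learn exits non-zero when learning fails, so test it on a
copy of the folder first:

```shell
#!/bin/sh
# Sketch only. LEARN_CMD and learn_dir are made-up names, not part of
# SpamAssassin or amavisd. LEARN_CMD receives one message on stdin.
LEARN_CMD=${LEARN_CMD:-'su vscan -c "sa-learn --spam"'}

# learn_dir <source dir> <done dir>
learn_dir() {
    find "$1" -type f | while IFS= read -r f; do
        # the calling user (root) opens the file; the learner only reads stdin
        if sh -c "$LEARN_CMD" < "$f"; then
            mv "$f" "$2"/
        else
            # leave the message where it is so the next run retries it
            echo "not learned, left in place: $f" >&2
        fi
    done
}
```

Run as root, this would be called as learn_dir "$spamfolder" "$killfolder".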
As you can see, I'm new to this stuff, so help is appreciated.
A simple setup is to use imap and maildir format (courier-imap or
dovecot). then tell your users to create some folders for sa-learn. for
instance, they create Junk/Miss to move the missed messages and
Junk/Innocent to copy legit messages classified as spam. feel free to
create other folders for other things.
then have a script that runs sa-learn as vscan. again, the mail files
aren't readable by vscan, so the script needs to read the maildir file
by file (as root) and pipe each message to sa-learn. while there is no
problem chmod-ing the spam folder so that vscan can read it, this is
less obvious for the ham folder.
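
A rough sketch of such a script, to be run as root. Several things here
are assumptions and not from this thread: that the imap server stores
the folders Maildir++ style (so Junk/Miss becomes .Junk.Miss inside the
user's maildir), the names SA_LEARN_AS, feed_folder and
learn_user_folders, and that it is acceptable to delete a message once
it has been learned:

```shell
#!/bin/sh
# Sketch only. Assumed: Maildir++ folder names and deleting learned mail.
# SA_LEARN_AS is a made-up variable: the command prefix that runs
# sa-learn under the amavisd uid.
SA_LEARN_AS=${SA_LEARN_AS:-'su vscan -c'}

# feed_folder <maildir folder> <sa-learn switch>
feed_folder() {
    for sub in cur new; do
        [ -d "$1/$sub" ] || continue
        find "$1/$sub" -type f | while IFS= read -r msg; do
            # root reads the message file, sa-learn gets it on stdin
            if $SA_LEARN_AS "sa-learn $2" < "$msg"; then
                rm -f "$msg"
            fi
        done
    done
}

# learn_user_folders <path to one user's maildir>
learn_user_folders() {
    feed_folder "$1/.Junk.Miss" --spam
    feed_folder "$1/.Junk.Innocent" --ham
}
```

It would be called once per user, for example
learn_user_folders /home/someuser/Maildir (a hypothetical path). Since
root does the reading, neither folder needs to be chmod-ed for vscan.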
of course, all this assumes you want to use a site-wide bayes db. in
that case you need to be careful about relying on your users'
classification (unless you trust them to classify correctly). on the
other hand, site-wide has the advantages of simplicity (only one db to
care for), less storage, less cpu/ram (multi-rcpt mail gets parsed
once), disposition coherence (a multi-rcpt message is either spam or
ham for everyone, not spam for one group and ham for the others. the
latter may cause problems like "but you've got that mail like I
did..."), faster learning (the db sees more messages), and
"spam-experience" sharing between the users. Now, a lot of people here
(and google) will tell you the benefits of a per-user db, so I'll stop
here. It really depends on your situation.