Hello Tony,

Tuesday, March 16, 2004, 2:45:59 PM, you wrote:

>> If you have access to the /etc/mail/spamassassin/local.cf
>> file (may be in a different directory according to how SA is
>> called), then you can add the parameter
>> > bayes_auto_learn 1

TB> Hey cool, done that now. Just looked at the headers of a message
TB> received which says "autolearn=ham" This was a message from the SA
TB> group funnily enough - presumably that is correct?

Unless that message included spam samples, no problem. I suggest you
set your non-spam auto-learn threshold
(bayes_auto_learn_threshold_nonspam in local.cf) to -0.01 to make sure
that spam that hits no rules, and so scores 0, is not accidentally
learned as ham.

TB> I managed to get sa-learn to work for a non-root account by deleting the
TB> bayes* files as you suggested.
TB> Presumably the bayes database applies then to any [EMAIL PROTECTED] for the
TB> userid I run it under?

My understanding is that each domain with a $HOME will have one
$HOME/.spamassassin directory, and the bayes database built there will
apply to all [EMAIL PROTECTED] for that domain.

TB> I've had a look at your script and it's given me some ideas thanks - I have
TB> written a script which will look for all files called learn_spam or
TB> learn_ham and run sa-learn on them, then "empties" the files by removing
TB> them and touching them (is there a better way?)

   cp /dev/null $file
or
   cat </dev/null >$file
are two methods I've used to empty files.
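
To show what emptying in place looks like, here is a minimal sketch. The helper name empty_file and the scratch file are mine, purely for illustration, and I'm assuming mktemp is available on your system. Unlike rm followed by touch, truncating keeps the same file, so its owner and permissions stay as they were.

```shell
#!/bin/sh
# empty_file is a hypothetical helper name, not part of SpamAssassin.
# It truncates the named file to zero bytes without removing it,
# so the owner and mode of the file are left unchanged.
empty_file() {
    : > "$1"
}

# Demo on a scratch file (assumes mktemp is available).
demo=$(mktemp) || exit 1
echo "some old mail" > "$demo"
empty_file "$demo"
wc -c < "$demo"      # prints the size in bytes after truncation
rm -f "$demo"
```

The ": >" form does the same job as the cp and cat versions, just without starting another process.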

TB> I know nothing about shell programming other than what I have picked up from
TB> Bob's script and google, so forgive if it's a little rough around the edges
TB> - is my first ever shell script!:

TB> ====================================
TB> #!/bin/sh
TB> if [ $1 -eq "d" ] ; then
TB>   SARGS="--showdots"
TB> fi
TB> echo "Learning SPAM"
TB> for FILE in `find $HOME -name learn_spam -print`
TB> do
TB>   echo "Processing $FILE"
TB>   sa-learn --spam --mbox $FILE $SARGS
TB>   rm $FILE
TB>   touch $FILE
TB> done
TB> echo "Learning HAM"
TB> for FILE in `find $HOME -name learn_ham -print`
TB> do
TB>   echo "Processing $FILE"
TB>   sa-learn --ham --mbox $FILE $SARGS
TB>   rm $FILE
TB>   touch $FILE
TB> done
TB> echo "Done"
TB> ====================================

TB> Any obvious flaws there guys, or something I could do better? It *seems*
TB> to work okay anyway.
TB> Should I bung them all into one file first????

Looks good to me, with one fix: -eq is a numeric comparison, so the
string test should be [ "$1" = "d" ] (quoted, so it also works when no
argument is given). I wouldn't cat them all into one file first, since
my understanding is that shorter, quicker sa-learn runs are better
(less chance they'll block bayes updates by incoming email and
auto-learn).

TB> The other thing is, how often should I run it - I've seen it mentioned
TB> before that you need about 200 spams and 200 hams for sa-learn to be
TB> effective - does this mean 200 _per run_ or that you need to have learned
TB> about that number in total for it to be effective?
TB> If the former, then presumably my script would be better off concatenating
TB> the spam and ham files before passing them to a single run of sa-learn?

I run my scripts once an hour.

You need 200+ spams and 200+ hams in the database in total before Bayes
takes effect and starts applying its scores to your emails. It then
remains effective unless you drop below those numbers (such as by
deleting the database files and starting over). That threshold has
nothing to do with how much a single sa-learn run processes.

The more often sa-learn runs, the more current your bayes database is.

Bob Menschel
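
P.S. In case it helps, here is a rough sketch of the same learn-and-empty loop written as a function. Assumptions on my part: the names learn_all and SA_LEARN are made up for illustration, the files are emptied in place rather than removed and touched, a file is only emptied if sa-learn exits successfully, and file names are assumed not to contain newlines. Treat it as a starting point, not a tested replacement.

```shell
#!/bin/sh
# Sketch only: SA_LEARN and learn_all are illustrative names.
# SA_LEARN lets you point at a different sa-learn binary if needed.
SA_LEARN=${SA_LEARN:-sa-learn}

# learn_all KIND NAME DIR [EXTRA_ARG]
#   KIND  is "spam" or "ham" (passed to sa-learn as --spam or --ham)
#   NAME  is the mbox file name to look for, e.g. learn_spam
#   DIR   is the directory tree to search
# Runs sa-learn on every file called NAME under DIR, then empties it.
learn_all() {
    kind=$1 name=$2 dir=$3 extra=$4
    find "$dir" -type f -name "$name" -print |
    while IFS= read -r file; do
        echo "Processing $file"
        # ${extra:+"$extra"} expands to nothing when no extra flag is given.
        # stdin comes from /dev/null so the command cannot eat the file list.
        if "$SA_LEARN" "--$kind" --mbox "$file" ${extra:+"$extra"} < /dev/null
        then
            : > "$file"    # empty the file only after a successful run
        else
            echo "sa-learn failed on $file, leaving it alone" >&2
        fi
    done
}

# Example use, mirroring the original script:
#   SARGS=""
#   [ "$1" = "d" ] && SARGS="--showdots"
#   learn_all spam learn_spam "$HOME" "$SARGS"
#   learn_all ham  learn_ham  "$HOME" "$SARGS"
```

Piping find into a read loop, with the variables quoted, means paths containing spaces are handled, which the backtick for-loop does not do.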
