------ Original Message ------
Received: Wed, 16 Feb 2005 12:31:49 AM EST
From: Robert Menschel <[EMAIL PROTECTED]>
To: FH <[EMAIL PROTECTED]>
Cc: users@spamassassin.apache.org
Subject: Re[4]: Care and feeding instructions for SpamAssassin?

> >> Next time you get one of those spam that sneaks through, run
> >>
> >>     spamassassin -D <email >output 2>debug.out
>
> F> There must be a disconnect somewhere. I just did this w/ a "drugs
> F> online" spam I just received. When it first came in it had a
> F> rating of 1.9, I saved it as a file (not an mbox) on the server and
> F> ran the above command and it reported a 12.5!!!
>
> What were the rule hit changes? Depending on the time between the
> first scan and the second, some of that might have been due to network
> tests having been taught the spam. The more time that passed, the more
> likely such a score increase would be. Bayes also could have been
> involved, since other emails could have increased the Bayes score.
>

It was less than 1/2 hour, because I was experimenting w/ the commands
when the new email came in, so I decided to use that one ;)

Initial email:
X-Spam-Status: No, score=1.9 required=4.0 tests=BAYES_99 autolearn=no
    version=3.0.2

After running it through the spamassassin -D command:
X-Spam-Status: Yes, score=12.5 required=4.0 tests=BAYES_99,
    RCVD_IN_BL_SPAMCOP_NET,URIBL_AB_SURBL,URIBL_OB_SURBL,URIBL_SC_SURBL,
    URIBL_WS_SURBL autolearn=no version=3.0.2

BTW, isn't the default autolearn spam threshold supposed to be 12?

Email after bounce:
X-Spam-Status: No, score=0.4 required=4.0 tests=BAYES_60 autolearn=ham
    version=3.0.2

spamassassin -D after bounce:
X-Spam-Status: Yes, score=7.9 required=4.0 tests=ALL_TRUSTED,BAYES_99,
    URIBL_AB_SURBL,URIBL_OB_SURBL,URIBL_SC_SURBL,URIBL_WS_SURBL
    autolearn=no version=3.0.2

Did I miss a switch somewhere? There seem to be more tests
running/reported when I run it manually than when it runs through the
system.

BTW, I don't know if it matters or not, but I used the book "Anti-Spam
Toolkit" as my reference guide when setting up the system. I also just
ordered the ORA book, so I'll give that a read through too. I'm mainly
just curious about the above now.

>
> Auto-learning it as ham is IMO a problem. I think that auto-learning
> anything with a positive score as ham is asking for trouble. I have my
> ham auto-learn thresholds set at -2. (I have several negative scoring
> rules specific to my domains.)
>

That's bayes_auto_learn_threshold_nonspam, right? I don't have it set in
local.cf, so I thought it would have been the default (0.1, right?). I'm
not sure why the 0.4 above was autolearned as ham?!? I just ran
spamassassin --lint -D but didn't see a report of the threshold. Should
it have been in there?

BTW, using a script Matias Bergero sent me, here's what the maillog has
wrt autolearning:

Since: Feb 14 03:10:07
Learned ham: 1815
Learned spam: 298

>
> By ~root/.spamassassin, do you mean each individual's root or home
> directory, then a .spamassassin directory under that? And in your
> config files, do you specify a Bayes database path?
>

Nope, just the root user has that dir/files. I didn't turn on the
individual user preferences (I read in a couple of places that this
wasn't recommended). From the /etc/mail/spamassassin/local.cf file:
"bayes_path /var/spool/spamassassin/sa".

>
> If you also have these bayes files in /var/spool/spamassassin, then why
> are they there? Are they being updated? I'm wondering whether you're
> training the $HOME/.spamassassin/bayes_* files but filtering on a
> central set of files.
>

AFAIK the only bayes files on the system are in /var/spool/spamassassin.
I can only assume those are the ones that are being updated when I do a
sa-learn. Is there any switch or other way of confirming this? I just
did a "tail -f sa_journal" while doing a "sa-learn --spam ..." and
nothing happened...

Is it time to throw the reset switch and start from scratch?

Thanks again for all the help :D
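
For comparing the rule hits between two scans of the same message, a small script along these lines might help. It is only a sketch: parse_tests and diff_tests are hypothetical helper names (not part of SpamAssassin), and it assumes the X-Spam-Status header uses the "tests=... autolearn=..." layout shown in the headers above. The two sample headers are the first pair from this message.

```python
import re

# The two headers quoted above: the scan done by the mail system and the
# manual "spamassassin -D" run on the same message.
SYSTEM_SCAN = """X-Spam-Status: No, score=1.9 required=4.0 tests=BAYES_99 autolearn=no
    version=3.0.2"""

MANUAL_SCAN = """X-Spam-Status: Yes, score=12.5 required=4.0 tests=BAYES_99,
    RCVD_IN_BL_SPAMCOP_NET,URIBL_AB_SURBL,URIBL_OB_SURBL,URIBL_SC_SURBL,
    URIBL_WS_SURBL autolearn=no version=3.0.2"""


def parse_tests(header):
    """Return the set of rule names in the tests= field of a header."""
    # Unfold continuation lines so the field can be matched in one piece.
    flat = " ".join(header.split())
    # Capture everything after tests= up to the next "key=" field (or the end).
    match = re.search(r"tests=(.*?)(?:\s+\w+=|$)", flat)
    if not match:
        return set()
    return {name.strip() for name in match.group(1).split(",") if name.strip()}


def diff_tests(before, after):
    """Return (rules only in `after`, rules only in `before`), both sorted."""
    old, new = parse_tests(before), parse_tests(after)
    return sorted(new - old), sorted(old - new)


if __name__ == "__main__":
    added, removed = diff_tests(SYSTEM_SCAN, MANUAL_SCAN)
    print("Only in manual run:", ", ".join(added) or "(none)")
    print("Only in system run:", ", ".join(removed) or "(none)")
```

Pasting the "after bounce" pair into the two constants gives the same comparison for the second message.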