On Feb 21, 2007, at 3:02 PM, Kris Deugau wrote:
> (at least, once Bayes was part of SA <g>) feed missed spam back
> into Bayes manually to complement the autolearning (which worked
> pretty well for me, and without which I'd have very VERY little ham
> learned at all).
I spent about a year training a good bayes corpus on one account, and
leaving bayes disabled on two others. The difference in spam caught
was a fraction of a percent, and when spammers started including
technical mailing-list chatter in their bayes-busting e-mails I
started having lots of false positives on the bayes-enabled account.
It simply doesn't pay off.
> Most third-party rules are scored to get spam over that threshold
> of 5 largely because, IME, most people seem to be quite happy to
> leave it at 5; if you're running a lower score, you WILL see FPs
> unless you *drop* the scores on some of the heavier rules. I
> probably saw at one point; what scores are these FPs getting on
> your system?
7-9. The reason I run at 3.8 is that I have 0 - none, null, void -
FPs between 3.8 and 5.0. The very few FPs I see are SPF failures,
which I score fairly harshly, and those start at 7.0.
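For anyone wanting to try something similar, a minimal local.cf sketch along these lines (the exact scores are assumptions for illustration, not my actual config; SPF_FAIL is the stock SpamAssassin rule name for a hard SPF failure):

```
# local.cf -- illustrative only
# Lower the tag threshold from the default of 5.0
required_score 3.8

# Score hard SPF failures harshly (assumed value)
score SPF_FAIL 7.0
```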
I used to have low FPs on code segments until I relegated the
chickenpox rulesets to 0.1 each. In fact, I plan to run a ruletest
because I never see chickenpox on real spam, so I'm pretty sure those
rules are useless now.
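Knocking a third-party ruleset down to near-nothing is just more score lines in local.cf; a sketch of the idea, with the rule names assumed as examples rather than copied from the actual chickenpox.cf:

```
# Defang the chickenpox rules rather than removing the ruleset;
# rule names below are assumed examples, not an exhaustive list
score CHICKENPOX_1_2 0.1
score CHICKENPOX_2_3 0.1
score CHICKENPOX_3_4 0.1
```

Setting a score to 0 outright would disable the rule entirely; 0.1 keeps it visible in the hit list so you can still see whether it ever fires on real spam.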
> I've had ONE customer that I ended up dropping the threshold to
> 4.8, because they kept getting spam that was *just* under 5. (I
> think I bumped it back up to 4.9 because of FPs. *sigh*) IIRC
> they're also the only customer that regularly seems to get pornspam
> (tagged or otherwise).
I can't imagine running at 4.8. A quick check confirms that more
than 600 spam messages would have hit this mailbox today. And that's
just this one mailbox, never mind hostmaster/webmaster/etc., which
get nailed even harder.
I don't have that kind of time.
--
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source
and other randomness