On Wed, 29 May 2013 15:16:58 -0400
Andrew Talbot wrote:

> Hi there, RW-
>
> Thank you for your response. A lot of interesting points in there. The
> issue with something like Bogofilter or its ilk is that it:
> 1- Requires manual intervention from users (we don't have access to
> the content of their messages)

That's the trouble with Bayes too. The difference is that SA has
score-based autolearning, but autolearning is not very good at learning
ham: you have to choose between learning low-scoring spam as ham and
not learning a representative selection of ham.

> 2- Apparently doesn't scale well to huge client bases with all kinds
> of diverse businesses. Our clients range from banking institutions to
> employment agencies to ... ehh... purveyors of adult objects. So its
> tough to find commonalities, and since we're so large, we can't
> exactly have different user accounts for each.

There's an element of truth there, but it applies equally well to Bayes.
Actually, Bogofilter is very similar to Bayes: the tokenization code was
written independently, but the rest of the algorithms are virtually
identical. There's no reason to think it any better or worse than Bayes
at dealing with diverse mail. I can see why you wouldn't want to use it,
though.
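
To make the autolearning trade-off mentioned earlier concrete, here is a
rough toy sketch in Python. It is not SA's actual code: the function
names, the scores, and the threshold values are all made up for
illustration. It only shows why a strict ham threshold learns too little
ham, while a loose one starts learning low-scoring spam as ham.

```python
# Toy model of score-based autolearning. All names, scores and thresholds
# are invented for illustration; this is not SpamAssassin's implementation.

def autolearn_decision(score, ham_threshold, spam_threshold):
    """Return 'ham', 'spam', or None (do not learn) for a message score."""
    if score <= ham_threshold:
        return "ham"
    if score >= spam_threshold:
        return "spam"
    return None

# Hypothetical (score, true label) pairs.
messages = [
    (-1.0, "ham"), (0.0, "ham"), (1.5, "ham"), (2.5, "ham"),
    (1.0, "spam"), (2.0, "spam"), (8.0, "spam"), (15.0, "spam"),
]

def summarize(ham_threshold, spam_threshold=12.0):
    """Count ham learned correctly and spam wrongly learned as ham."""
    ham_learned = 0
    spam_as_ham = 0
    for score, label in messages:
        if autolearn_decision(score, ham_threshold, spam_threshold) == "ham":
            if label == "ham":
                ham_learned += 1
            else:
                spam_as_ham += 1
    return ham_learned, spam_as_ham

strict = summarize(ham_threshold=0.1)  # learns only 2 of the 4 ham
loose = summarize(ham_threshold=3.0)   # learns all 4 ham, plus 2 spam as ham
print(strict, loose)
```

With the strict threshold, half of the ham in this made-up set is never
learned; with the loose one, all of it is learned, but so are the two
low-scoring spam messages.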