On Thu, 2004-05-20 at 15:06, E. Falk wrote:
[snip]
> What concerns me is that a hoax ruleset has some serious drawbacks
> (which have already been discussed). It would be a ruleset entirely
> unrelated to spam, but rather focused entirely on a user's personal
> mail. False positives would be enough of a concern, but even accurate
> positives have their problems!
>
[snip]
> A newbie with AWL, Bayes auto-learning, and a hoax ruleset will very
> soon see his or her e-mail system degenerate into a massive mess.
>
> I think my main point is this... treating hoaxes as spam is a bad idea.
> An ideal add-on or improvement to SpamAssassin would be to have some way
> to deal with unwanted personal e-mails - but spam is bulk and hoaxes are
> personal. Those two don't mix well.

Just to throw in my ($1/50):

For the reasons cited already, I also couldn't recommend using a regular
SpamAssassin installation to filter hoaxes. One should fully understand
the possible side effects it would have on the accuracy of filtering
"real" spam and (more importantly) on not filtering ham.

I imagine Bayes could easily be trained to recognize hoaxes using the
body content alone; headers would probably be far less useful since
hoaxes come from legitimate sources. It's easy to envision either a
separate Bayesian filter working in parallel with SA that's trained only
to recognize hoaxes, or maybe a second SA installation (can you say
"overhead"?).

*light bulb* On second thought, can't we use rule tflags to do just this
type of check without invalidating Bayes and/or AWL?

-- 
Chris Thielen

Easily generate SpamAssassin rules to catch obfuscated spam phrases
(0BFU$C/\TED SPA/\/\ P|-|RA$ES): http://www.sandgnat.com/cmos/

Keep up to date with the latest third party SpamAssassin Rulesets:
http://www.exit0.us/index.php/RulesDuJour
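
To make the "separate Bayesian filter trained only on bodies" idea above a little more concrete, here is a rough Python sketch. It is not part of SpamAssassin and does not reflect how SA's own Bayes works. The class name `BodyOnlyHoaxFilter`, the `"hoax"`/`"ham"` labels, the tokenizer, and the sample messages are all made up for illustration. It simply shows a small naive Bayes classifier that never looks at headers.

```python
import math
import re
from collections import Counter


def tokenize(body):
    """Lowercase the message body and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", body.lower())


class BodyOnlyHoaxFilter:
    """Hypothetical two-class naive Bayes model trained on body text only."""

    LABELS = ("hoax", "ham")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.message_counts = {label: 0 for label in self.LABELS}

    def train(self, body, label):
        """Add one message body to the counts for the given label."""
        self.word_counts[label].update(tokenize(body))
        self.message_counts[label] += 1

    def hoax_probability(self, body):
        """Return the estimated probability (0..1) that the body is a hoax."""
        vocab = set(self.word_counts["hoax"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        tokens = tokenize(body)
        log_scores = {}
        for label in self.LABELS:
            # Class prior with add-one smoothing.
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            # Add-one smoothing per token; the extra +1 leaves room for unseen words.
            denominator = total_words + len(vocab) + 1
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denominator)
            log_scores[label] = score
        # Normalize the two log scores into a probability for "hoax".
        top = max(log_scores.values())
        weights = {label: math.exp(s - top) for label, s in log_scores.items()}
        return weights["hoax"] / (weights["hoax"] + weights["ham"])


# Toy training data, invented purely for this example.
hoax_filter = BodyOnlyHoaxFilter()
hoax_filter.train("Forward this to everyone you know or the virus will erase your hard drive", "hoax")
hoax_filter.train("Bill Gates will pay you for every person you forward this email to", "hoax")
hoax_filter.train("Lunch on Friday? The meeting moved to three", "ham")
hoax_filter.train("Attached are the photos from the family trip", "ham")
```

Because a filter like this keeps its own counts, training it on hoaxes would not touch SA's Bayes database or AWL, which is the separation discussed above. A real deployment would need far more training mail and a sensible threshold.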
