RE: Scoring Hoaxes

Chris Thielen 20 May 2004 20:26:16 -0000

On Thu, 2004-05-20 at 15:06, E. Falk wrote:
<snip>
> What concerns me is that a hoax ruleset has some serious drawbacks 
> (which have already been discussed). It would be a ruleset entirely 
> unrelated to spam, but rather focused entirely on a user's personal 
> mail. False positives would be enough of a concern, but even accurate 
> positives have their problems!
> 
<snip>
> A newbie with AWL, Bayes auto-learning, and a hoax ruleset will very 
> soon see his or her e-mail system degenerate into a massive mess.
> 
> I think my main point is this... treating hoaxes as spam is a bad idea. 
> An ideal add-on or improvement to SpamAssassin would be to have some way 
> to deal with unwanted personal e-mails - but spam is bulk and hoaxes are 
> personal. Those two don't mix well.


Just to throw in my ($1/50):

For the reasons cited already, I also couldn't recommend using a regular
SpamAssassin installation to filter hoaxes.  One should fully understand
the possible side effects it would have on the accuracy of filtering "real"
spam and (more importantly) on not filtering ham.

I imagine Bayes could easily be trained to recognize hoaxes using the
body content alone; headers would probably be far less useful since
hoaxes come from legitimate senders.  It's easy to envision either a
separate Bayesian filter working in parallel with SA that's trained only
to recognize hoaxes, or maybe a second SA installation (can you say
"overhead"?).
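
To make the "separate Bayesian filter" idea a bit more concrete, here is a
rough sketch of a body-only naive Bayes classifier.  It is not how SA's
Bayes works internally; the names (HoaxBayes, body_tokens, the "hoax"/"ok"
labels) are made up for illustration, and the tokenizer is deliberately
crude.

```python
import math
import re
from collections import Counter
from email import message_from_string


def body_tokens(raw_message):
    """Return lowercase word tokens from the body only; headers are ignored."""
    msg = message_from_string(raw_message)
    payload = msg.get_payload()
    if isinstance(payload, list):
        # Multipart message: keep only parts whose payload is plain text.
        parts = [p.get_payload() for p in payload]
        payload = " ".join(p for p in parts if isinstance(p, str))
    return re.findall(r"[a-z']+", payload.lower())


class HoaxBayes:
    """Hypothetical two-class naive Bayes filter trained on body text alone."""

    LABELS = ("hoax", "ok")

    def __init__(self):
        self.counts = {label: Counter() for label in self.LABELS}
        self.messages = {label: 0 for label in self.LABELS}

    def train(self, raw_message, label):
        self.counts[label].update(body_tokens(raw_message))
        self.messages[label] += 1

    def hoax_probability(self, raw_message):
        tokens = body_tokens(raw_message)
        vocab = len(set(self.counts["hoax"]) | set(self.counts["ok"]))
        total_msgs = sum(self.messages.values())
        log_score = {}
        for label in self.LABELS:
            total = sum(self.counts[label].values())
            # Class prior and per-token likelihoods, both with add-one smoothing.
            score = math.log((self.messages[label] + 1) / (total_msgs + 2))
            for tok in tokens:
                score += math.log((self.counts[label][tok] + 1) / (total + vocab + 1))
            log_score[label] = score
        # Convert the log-odds to a probability; clamp to avoid overflow in exp().
        diff = max(min(log_score["ok"] - log_score["hoax"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))
```

Usage would be along the lines of calling train(raw, "hoax") or
train(raw, "ok") on hand-sorted mail and then checking
hoax_probability(raw) against whatever threshold you're comfortable with.
Since it never touches SA's Bayes database or the AWL, it can't skew
either one.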



*light bulb*  On second thought, can't we use rule tflags to do just
this type of check without invalidating Bayes and/or AWL?


-- 
Chris Thielen

Easily generate SpamAssassin rules to catch obfuscated spam phrases
(0BFU$C/\TED SPA/\/\ P|-|RA$ES): http://www.sandgnat.com/cmos/

Keep up to date with the latest third party SpamAssassin Rulesets:
http://www.exit0.us/index.php/RulesDuJour
