Joe Emenaker <[EMAIL PROTECTED]> writes:

> But Bayes keys on specific words, not patterns, and the spammers have, 
> at their disposal, a practically infinite number of permutations (making 
> for arduous training of Bayes) which all fit a certain pattern (lending 
> itself to adding one SA rule).

Nah.  Check your Bayes database.  I bet a lot of the tokens have counts
higher than one.  That means spammers reused the words.  Never mind all of
the stuff in the message that they couldn't obfuscate that way (like URIs).
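
If you want to check the counts rather than take my word for it, here is
a rough sketch.  It assumes you have saved the output of
"sa-learn --dump data" to a list of lines, and that each line has five
whitespace-separated columns (probability, spam count, ham count, atime,
token).  That layout is an assumption, so verify it against your own dump
first.  The function name and the sample lines are made up for
illustration.

```python
def reused_spam_tokens(dump_lines, min_count=2):
    """Return (token, spam_count) pairs seen in spam at least min_count times.

    Assumed line layout (check your own dump):
        probability  spam_count  ham_count  atime  token
    """
    reused = []
    for line in dump_lines:
        fields = line.split(None, 4)
        if len(fields) != 5:
            continue  # not a token line in the assumed layout
        _prob, nspam, _nham, _atime, token = fields
        token = token.strip()
        if token.startswith("non-token data"):
            continue  # skip database metadata lines if they are present
        try:
            count = int(nspam)
        except ValueError:
            continue  # spam count column was not a number
        if count >= min_count:
            reused.append((token, count))
    # Most frequently reused tokens first
    return sorted(reused, key=lambda pair: pair[1], reverse=True)


# Hypothetical sample lines, not taken from a real database
sample = [
    "0.998          7          0 1089300000  v1agra",
    "0.958          1          0 1089300000  vi@gr@",
    "0.012          0         12 1089300000  meeting",
]
print(reused_spam_tokens(sample))
```

Any token that shows up in the result was seen in more than one spam,
munged or not.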

> Maybe he was using Bayes but it hadn't been trained on those particular 
> munges of those particular words. Maybe it was caught... maybe it 
> wasn't. Regardless, he's looking for a rule to make SA more accurate.

Oh, that rule is called 3.0.0.  :-)

Literally... there's a new rule that searches for randomized text.
 
> In my case, I've turned off all of the auto-learning of my Bayes db,
> and I carefully train it by hand with the spam and ham that I sort
> through.  Most ham gets a score of less than zero and most spam gets a
> score of about 4 or more. Still, however, there's some
> overlap.... which means that SA can still use some help becoming more
> accurate, which seems to be what this guy is after.

As you probably know, network tests help a lot with false negatives.
There are a lot of new ones in SpamAssassin 3.0.0-pre1 too.

Daniel

-- 
Daniel Quinlan
http://www.pathname.com/~quinlan/
