Joe Emenaker <[EMAIL PROTECTED]> writes:

> But Bayes keys on specific words, not patterns, and the spammers have,
> at their disposal, a practically infinite number of permutations (making
> for arduous training of Bayes) which all fit a certain pattern (lending
> itself to adding one SA rule).

Nah. Check your Bayes database. I bet a lot of the tokens have counts
higher than one. That means spammers reused the words. Never mind all of
the stuff in the message that they couldn't obfuscate that way (like
URIs).

> Maybe he was using Bayes but it hadn't been trained on those particular
> munges of those particular words. Maybe it was caught... maybe it
> wasn't. Regardless, he's looking for a rule to make SA more accurate.

Oh, that rule is called 3.0.0. :-) Literally... there's a new rule that
searches for randomized text.

> In my case, I've turned off all of the auto-learning of my Bayes db,
> and I carefully train it by hand with the spam and ham that I sort
> through. Most ham gets a score of less than zero and most spam gets a
> score of about 4 or more. Still, however, there's some
> overlap.... which means that SA can still use some help becoming more
> accurate, which seems to be what this guy is after.

As you probably know, network tests help a lot with false negatives.
There are a lot of new ones in SpamAssassin 3.0.0-pre1 too.

Daniel

--
Daniel Quinlan
http://www.pathname.com/~quinlan/
