Daniel Quinlan wrote:
Joe Emenaker <[EMAIL PROTECTED]> writes:
But Bayes keys on specific words, not patterns, and the spammers have, at their disposal, a practically infinite number of permutations (making for arduous training of Bayes) which all fit a certain pattern (lending itself to adding one SA rule).
That was my first thought as well, but then you need to do it for each drug or each word that might be "stuck-keyed". However, when I checked my Bayes db, I found lots of words like these:
answeeeer
emaaaail
[...]
Yeah, but that's why Bayes works. By trying to evade using ordinary words, the spammer makes it only worse for themselves!
... except we all already know what he's talking about. We've all gotten this kind of spam and we know what it looks like.
So... this would turn into a never-ending reactive "whack-a-mole" job.
And, also, when looking at the original message, I noticed that the
sender asked for a "generic" rule that did it.....
Sure, but nobody asked him to show the original spam. It's kinda silly to devote a lot of time writing rules without knowing more, starting with whether the spam was missed at all. Maybe it was missed, but was he using Bayes?
Maybe he was using Bayes but it hadn't been trained on those particular munges of those particular words. Maybe it was caught... maybe it wasn't. Regardless, he's looking for a rule to make SA more accurate.
In my case, I've turned off all of the auto-learning of my Bayes db, and I carefully train it by hand with the spam and ham that I sort through. Most ham gets a score of less than zero and most spam gets a score of about 4 or more. Still, however, there's some overlap.... which means that SA can still use some help becoming more accurate, which seems to be what this guy is after.
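To make the "one pattern instead of endless words" point concrete, here is a rough sketch in Python rather than an actual SA rule. The name STUCK_KEY, the helper stuck_key_words, and the threshold of three repeated letters are all assumptions for illustration; nothing here was tested against real spam.

```python
import re

# Hypothetical pattern: a word containing the same letter three or more
# times in a row. The threshold of three is an assumption, not something
# taken from the thread.
STUCK_KEY = re.compile(r"\b[a-z]*([a-z])\1{2,}[a-z]*\b", re.IGNORECASE)


def stuck_key_words(text):
    """Return the words in text that contain a letter repeated 3+ times in a row."""
    return [m.group(0) for m in STUCK_KEY.finditer(text)]


sample = "please answeeeer this emaaaail about a good book"
print(stuck_key_words(sample))
```

One expression covers both "answeeeer" and "emaaaail" without listing either word, while ordinary doubled letters such as "good" are left alone. A threshold of three would also flag strings like "www", so a real rule would probably need a higher count or a low score.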
- Joe
