Hello Code,

On 13 Jul 2004 at 11:44:25 -0700 GMT [20:44 CEST] you wrote:
C2> I've noticed spam, now, often contains several rows of meaningless,
C2> random words, like:

C2> "annul cripple jacobean hater metric deal prophesy diversify final"

C2> I assume this is an attempt to foil Bayesian spam filters like
C2> BayesIt.

Yes. It doesn't work though.

C2> If Bayesian filters look for certain recurring words
C2> identified as common junk mail words to measure spamminess, it looks
C2> like including an abundance of non-junk mail words will allow this
C2> spammer technique to bypass the filter.

No. Spammers don't know what your legitimate mail looks like. The words
they include have probably never occurred in your legitimate mail, or at
least were not markers for it. They will most likely have neutral spam
probabilities. But there will still be words which mark the mail as
spam, and BayesIt can use those to recognize it. Even if by chance they
include a word that has a very low spam probability, it will most likely
not be enough for the spam to come through as legitimate mail. Those
words may even become markers for spam, because they appear only in
those spams and not in your legitimate mail.

C2> Any training suggestions for BayesIt?

If BayesIt misses a spam, tell it. Over here it recognizes those mails
just fine.

-- 
Cheers,
Andre :andre:

"I'm all in favor of keeping dangerous weapons out of the hands of fools.
Let's start with typewriters."

________________________________________________
Current version is 2.11.02 | 'Using TBUDL' information:
http://www.silverstones.com/thebat/TBUDLInfo.html
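
P.S. A rough sketch of the reasoning in the reply above, for anyone who
wants to see it in numbers. This is not BayesIt's actual code, and I am
not claiming it matches BayesIt's internals. It assumes a simple
Graham-style naive Bayes combination; the word counts, the 0.01/0.99
clamping, and the function names are all made up for illustration.
Words the filter has never seen get a neutral 0.5, so they scale the
spam and ham products equally and leave the score where it was.

```python
from math import prod

# Hypothetical training data: how many spam / legitimate mails contained each word.
SPAM_COUNTS = {"viagra": 40, "free": 30, "click": 25}
HAM_COUNTS = {"free": 5, "meeting": 30, "final": 20}
N_SPAM, N_HAM = 50, 50  # assumed number of mails in each training set


def token_spam_prob(token):
    """Per-word spam probability; words never seen in training are neutral."""
    s = SPAM_COUNTS.get(token, 0)
    h = HAM_COUNTS.get(token, 0)
    if s + h == 0:
        return 0.5  # unknown word: no evidence either way
    ps, ph = s / N_SPAM, h / N_HAM
    # Clamp so that a single word can never be absolute proof.
    return min(0.99, max(0.01, ps / (ps + ph)))


def message_spam_prob(tokens):
    """Combine the per-word probabilities, naive Bayes style."""
    probs = [token_spam_prob(t) for t in set(tokens)]
    spam = prod(probs)
    ham = prod(1.0 - p for p in probs)
    return spam / (spam + ham)


spam_words = ["viagra", "click", "free"]
random_words = ["annul", "cripple", "jacobean", "hater", "metric"]

base = message_spam_prob(spam_words)
padded = message_spam_prob(spam_words + random_words)
lucky = message_spam_prob(spam_words + random_words + ["final"])

print("spam only:            ", base)
print("plus random words:    ", padded)
print("plus one 'good' word: ", lucky)
```

In this toy setup the random padding leaves the score unchanged, and
even the one lucky hit on a word from legitimate mail ("final") only
lowers it slightly, because the real spam markers are still in the
message.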

