Bob George <[EMAIL PROTECTED]> wrote:
>Dan Melomedman <[EMAIL PROTECTED]> wrote:
>> Pat Noordsij wrote:
>>> I have one email that included 2 pages of text from Tom
>>> Sawyer.
>>>
>>> It didn't get caught.
>>
>> There are also sentence-writing AI programs conveniently
>> available for spammers. Finally they found a way to foil
>> Bayesian filters. Congratulations.
>>
>> Welp, time to find a new anti-spam mechanism. What is it this
>> time?
>
>Hey wait... this has come up repeatedly. With a well trained bayes, you should
>still have good odds of catching these. There are enough words that will ONLY
>show up in spam, PLUS THE OTHER characteristics of the messages to detect
>"spammy" messages. And the add-on rule sets should add even more teeth as new
>techniques evolve.
>
>I wouldn't give up just yet, but maybe modify how I train bayes.

I'm not saying Bayes isn't working most of the time, but it does seem
possible to craft sentences that skew it strongly towards ham. The faked
sentences in the original posting contained plenty of hammish words, but
a few of them would eventually end up as spam markers with training:
for example, the improperly hyphenated "calms-down" and "stands-still",
as well as "caw" and "binocycles".

Pierre Thomson
BIC
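
P.S. A rough sketch of the effect, assuming Graham-style naive Bayes
combining of per-token probabilities. That is an assumption made for
illustration and not necessarily how SpamAssassin combines tokens. Every
probability below is invented, and the names combine, spam_tokens,
ham_padding and trained_markers are hypothetical.

```python
import math

def combine(probs):
    """Combine per-token spam probabilities, naive-Bayes style.

    Assumes tokens are independent and spam/ham priors are equal.
    Probabilities must lie strictly between 0 and 1.
    """
    # Sum logs instead of multiplying, so long messages do not underflow.
    log_spam = sum(math.log(p) for p in probs)
    log_ham = sum(math.log(1.0 - p) for p in probs)
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))

# Invented numbers, for illustration only.
spam_tokens = [0.95, 0.90, 0.90]   # the actual sales pitch
ham_padding = [0.20] * 10          # literary filler that looks like ham
trained_markers = [0.99] * 4       # odd tokens such as "binocycles",
                                   # after they have been learned as spam

plain = combine(spam_tokens)
padded = combine(spam_tokens + ham_padding)
retrained = combine(spam_tokens + ham_padding + trained_markers)

print(f"pitch only:             {plain:.4f}")
print(f"pitch + ham padding:    {padded:.4f}")
print(f"after training markers: {retrained:.4f}")
```

With these made-up numbers, the padding drags the combined score from
near 1 to near 0, and a handful of strongly spammy tokens pulls it back
up once they have been trained. Real token probabilities will differ.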
