On Tuesday 17-10-2006 at 16:12 [timezone -0600], Quinn wrote:
> > Sounds like something a "dissociated press" or other random text
> > generator created. Perhaps you know about the monkeys with a
> > typewriter? If you let a thousand monkeys press random keys on
> > a typewriter, eventually one of them will by accident write a
> > few lines from a Shakespeare sonnet. These random text
> > generators work in a similar way.
>
> Interesting. I hadn't realized that was being used to actually do anything;
> that's kind of cool. Not sure if these are coming from that sort of thing,
> though. There are references to specific websites and publications
> scattered around self-referentially. I really think they're somehow farming
> real source and taking strings of variable length and just stringing them
> together. It's a pretty good way to produce coherent-ish body text that
> doesn't read as gibberish from an electronic standpoint.
>
> So, does this sort of thing defeat SpamBayes? They're making it through the
> filter with great regularity, and have been for quite a while, so the
> algorithms haven't figured it out in several hundred messages. Is there
> _any_ way to deal with it, in SB or any other filter other than sender
> black- or white lists?

I suppose there must be some way, because I don't get them. Your message
with the example scored as unsure:

X-Spambayes-Classification: unsure; 0.79

If it didn't include the typical spambayes mailing list headers, I'm sure
it would have gotten an even higher spam score.

-- 
Amedee
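
To make the "unsure; 0.79" header above a bit more concrete, here is a minimal sketch of how a combined score could be mapped to a three-way label. It is not SpamBayes' actual code: the function name `classify` is hypothetical, the cutoffs of 0.20 and 0.90 are assumed to match the usual `ham_cutoff` and `spam_cutoff` settings (check your own configuration), and the handling of scores exactly on a boundary is my assumption.

```python
# Assumed cutoffs; believed to match common SpamBayes settings,
# but verify against your own configuration.
HAM_CUTOFF = 0.20
SPAM_CUTOFF = 0.90


def classify(score, ham_cutoff=HAM_CUTOFF, spam_cutoff=SPAM_CUTOFF):
    """Map a combined spam score in [0, 1] to 'ham', 'unsure' or 'spam'.

    Hypothetical helper for illustration only.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < ham_cutoff:
        return "ham"
    if score > spam_cutoff:
        return "spam"
    # Everything between the two cutoffs (inclusive) is left for review.
    return "unsure"


if __name__ == "__main__":
    print(classify(0.79))  # the score from the header above
```

Under these assumed cutoffs, 0.79 falls in the middle band, so the message is flagged for review rather than filed as spam. Lowering `spam_cutoff` would catch it, at the cost of more false positives.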
