On Thu, 2004-01-29 at 04:28, Diego Zamboni wrote:
> - Long sequences of random dictionary words in their messages, which
> perhaps make it look more "normal" to filters.

I use bogofilter (a Bayesian filter [only]). When the
heap-of-random-dictionary-words technique cropped up, I was really
worried - it seemed a good workaround. For a while I started getting
15-20% false negatives.

I thought I'd have to ditch it and go to a full-blown SpamAssassin setup,
but I faithfully trained for a week or two, and suddenly my false
negatives were right back down to 1-2 per 1000.

My guess [this is entirely unscientific] is that it backfired on them.
The dictionary is relatively big, but the set of words commonly used is
*really* small in comparison. Because they use words that I and my
correspondents *never* use, the score on uncommon words (take
"lanthanide" and "dispensary". Who are they kidding?) goes up, and they
become clear markers for spam.
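
To make the guess above concrete, here is a rough sketch of how a
per-word spam score might behave once the filter has been trained on a
few of these messages. This is not bogofilter's actual code; it assumes a
Robinson-style smoothed estimate, and the function name, the parameters
`s` and `x`, and the toy messages are all invented for illustration.

```python
from collections import Counter


def token_spam_score(word, spam_counts, ham_counts, n_spam, n_ham, s=1.0, x=0.5):
    """Smoothed estimate that a message containing `word` is spam.

    s: weight given to the prior (assumed value)
    x: score assigned to a word never seen in training (assumed value)
    """
    n = spam_counts[word] + ham_counts[word]
    if n == 0:
        return x  # no evidence either way
    spam_freq = spam_counts[word] / n_spam
    ham_freq = ham_counts[word] / n_ham
    p = spam_freq / (spam_freq + ham_freq)
    # Pull the raw ratio toward x when the word has been seen only rarely.
    return (s * x + n * p) / (s + n)


# Toy training corpus (invented for this sketch).
spam_msgs = [
    "cheap pills lanthanide dispensary",
    "buy now lanthanide quorum",
    "cheap meeting dispensary offer",
]
ham_msgs = [
    "meeting moved to friday",
    "patch for the filter attached",
    "see you at the meeting",
]

# Count each word once per message it appears in.
spam_counts = Counter(w for m in spam_msgs for w in set(m.split()))
ham_counts = Counter(w for m in ham_msgs for w in set(m.split()))

scores = {
    word: token_spam_score(word, spam_counts, ham_counts,
                           len(spam_msgs), len(ham_msgs))
    for word in ("lanthanide", "meeting", "zeppelin")
}
for word, score in scores.items():
    print(f"{word}: {score:.2f}")
```

In this toy setup "lanthanide" shows up only in spam, so its score lands
well above 0.5, while "meeting", which my correspondents actually use,
stays below it. A word the filter has never seen sits at the neutral
value until training says otherwise.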

[I wonder how many people are spam blocking this thread? :)]

AfC

-- 
Andrew Frederick Cowie
Operational Dynamics Consulting Pty Ltd

Australia: +61 2 9977 6866  North America: +1 646 472 5054

http://www.operationaldynamics.com/

--
[EMAIL PROTECTED] mailing list
