There are a couple of rules you can use that do a good job of catching this:
rawbody HN_WORDWORD10 /(?:\b(?!(?:from|even|more|this|that|were|with)\b)[a-z]{4,12}[.,:;'!?-]?\s+){10}/
describe HN_WORDWORD10 LOCAL: string of 10+ random words
score HN_WORDWORD10 .5
rawbody HN_WORDWORD15 /(?:\b(?!(?:from|even|more|this|that|were|with)\b)[a-z]{4,12}[.,:;'!?-]?\s+){15}/
describe HN_WORDWORD15 LOCAL: string of 15+ random words
score HN_WORDWORD15 2.5
rawbody HN_WORDWORD30 /(?:\b(?!(?:from|even|more|this|that|were|with)\b)[a-z]{4,12}[.,:;'!?-]?\s+){30}/
describe HN_WORDWORD30 LOCAL: string of 30+ random words
score HN_WORDWORD30 5
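
If you want to sanity-check the pattern outside SpamAssassin first, here is a rough Python sketch of the same idea. It assumes the lookahead is meant to be a plain negative lookahead, (?!...), that skips a handful of common four-letter words. The names wordword, salad and prose are made up for this example, and Python's re module is only a stand-in for the Perl regex engine SpamAssassin actually uses, so treat the results as indicative only.

```python
import re

# Common words that should not count toward a run (list taken from the rules).
COMMON = r"(?:from|even|more|this|that|were|with)"


def wordword(n):
    """Build a pattern matching n consecutive lowercase words of 4-12 letters.

    Hypothetical helper: mirrors the HN_WORDWORD* rules, assuming the
    lookahead is a plain negative lookahead on the common-word list.
    """
    return re.compile(
        r"(?:\b(?!" + COMMON + r"\b)[a-z]{4,12}[.,:;'!?-]?\s+){" + str(n) + r"}"
    )


# Eleven unrelated words, the kind of filler appended to defeat Bayes.
salad = "apple river stone cloud maple tiger ocean piano violet ember garden\n"

# Ordinary prose: short words and the excluded common words break up the runs.
prose = ("i hope that this note finds you well and that we can meet "
         "for tea on the day after lunch\n")

if __name__ == "__main__":
    print("salad hits 10:", bool(wordword(10).search(salad)))
    print("salad hits 15:", bool(wordword(15).search(salad)))
    print("prose hits 10:", bool(wordword(10).search(prose)))
```

Normal text rarely strings together ten words of four or more letters without a short word in between, which is why the 10-word rule only gets a small score and the 30-word rule gets a large one.
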
> -----Original Message-----
> From: Jamie Penman-Smithson [mailto:[EMAIL PROTECTED]
> Sent: March 10, 2004 10:41 AM
> To: [EMAIL PROTECTED]
> Subject: Spammer's new tricks
>
>
> Recently, I've seen a lot of spam with either a lot of random words
> appended to the bottom, or even whole unrelated paragraphs from news
> reports or other random rubbish. I assume this is meant to mess up
> bayes filtering...
>
> I doubt I'm the only one seeing this phenomenon. I'm interested as to
> what kind of effect this will have, if any, on SA's scoring/bayes
> analysis of mail?
>
> -j
>
> --
> -jamie <[EMAIL PROTECTED]> | spamtrap: [EMAIL PROTECTED]
> w: http://silverdream.org | p: [EMAIL PROTECTED]
> pgp key @ http://silverdream.org/~jps/pub.key
> 15:30:01 up 7 days, 50 min, 13 users, load average: 0.59, 0.49, 0.35
>