Here are a few rules you can use that do a good job of catching this:

rawbody     HN_WORDWORD10  /(?:\b(?!(?:from|even|more|this|that|were|with)\b)[a-z]{4,12}[.,:;'!?-]?\s+){10}/
describe    HN_WORDWORD10  LOCAL: string of 10+ random words
score       HN_WORDWORD10  .5

rawbody     HN_WORDWORD15  /(?:\b(?!(?:from|even|more|this|that|were|with)\b)[a-z]{4,12}[.,:;'!?-]?\s+){15}/
describe    HN_WORDWORD15  LOCAL: string of 15+ random words
score       HN_WORDWORD15  2.5

rawbody     HN_WORDWORD30  /(?:\b(?!(?:from|even|more|this|that|were|with)\b)[a-z]{4,12}[.,:;'!?-]?\s+){30}/
describe    HN_WORDWORD30  LOCAL: string of 30+ random words
score       HN_WORDWORD30  5
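
If you want a rough feel for what these patterns hit before loading them, something like the Python sketch below may help. It is only an approximation: Python's `re` is close to Perl's regex engine but not identical, it says nothing about how SpamAssassin splits up the raw body, and it assumes the lookahead is written as `(?!(?:...)\b)` so that the common-word list is actually excluded. The helper name `wordword` and the sample strings are made up for illustration.

```python
import re

# Common words that should not count toward a run (list taken from the rules).
COMMON = "from|even|more|this|that|were|with"


def wordword(n):
    """Hypothetical helper: compile the word-run pattern for n words in a row."""
    return re.compile(
        r"(?:\b(?!(?:%s)\b)[a-z]{4,12}[.,:;'!?-]?\s+){%d}" % (COMMON, n)
    )


# Twelve unrelated lowercase words, each followed by whitespace.
words = ["lantern", "orchid", "granite", "velvet", "harbor", "pickle",
         "saddle", "timber", "walnut", "zephyr", "cobalt", "meadow"]
spammy = " ".join(words) + " "

# Ordinary prose: short words, capitals and contractions break up the runs.
normal = ("I doubt I'm the only one seeing this phenomenon. "
          "I'm interested as to what kind of effect this will have.")

# Only words from the exclusion list, so no run should ever start.
common_only = "this that were with from even more this that were with from "

print(wordword(10).search(spammy) is not None)       # 12 words, run of 10 found
print(wordword(15).search(spammy) is not None)       # only 12 words, no run of 15
print(wordword(10).search(normal) is not None)       # prose has no long run
print(wordword(10).search(common_only) is not None)  # excluded words never count
```

The real test is still to run the rules against your own ham and spam corpus and adjust the scores from there.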

> -----Original Message-----
> From: Jamie Penman-Smithson [mailto:[EMAIL PROTECTED]
> Sent: March 10, 2004 10:41 AM
> To: [EMAIL PROTECTED]
> Subject: Spammer's new tricks
> 
> 
> Recently, I've seen a lot of spam with either a lot of random words
> appended to the bottom, or even whole unrelated paragraphs from news
> reports or other random rubbish. I assume this is meant to mess up
> Bayes filtering...
> 
> I doubt I'm the only one seeing this phenomenon. I'm interested as to
> what kind of effect this will have, if any, on SA's scoring/bayes
> analysis of mail?
> 
> -j
> 
> -- 
> -jamie <[EMAIL PROTECTED]> | spamtrap: [EMAIL PROTECTED]
>  w: http://silverdream.org | p: [EMAIL PROTECTED]
>  pgp key @ http://silverdream.org/~jps/pub.key
>  15:30:01 up 7 days, 50 min, 13 users,  load average: 0.59, 0.49, 0.35
> 
