Hello Eric,

Thursday, February 5, 2004, 4:40:28 PM, you wrote:

EF> but the typo (below) saying "with with" is a good identifier for this
EF> particular program.

Agreed. "with with" hits 3 spam here, no ham.
Just developed this rule, which I'll be testing tonight:
header    RM_hr_WithWith         Received =~ / with with /
describe  RM_hr_WithWith         Spam identified by typo in received header
score     RM_hr_WithWith         1.000  # type=spamp - 
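
For a quick sanity check outside SpamAssassin, the same pattern can be tried in Python. This is only a minimal sketch: the two Received lines are invented for illustration, and the helper name has_with_with is hypothetical.

```python
import re

# Same pattern as RM_hr_WithWith: the doubled word with a space on each side.
WITH_WITH = re.compile(r" with with ")

# Hypothetical Received headers, made up for this example.
typo = ("Received: from host.example.net (192.0.2.10) "
        "by mx.example.com with with SMTP id abc123")
normal = ("Received: from host.example.net (192.0.2.10) "
          "by mx.example.com with ESMTP id abc123")


def has_with_with(received):
    """Return True if the header text contains the ' with with ' typo."""
    return WITH_WITH.search(received) is not None


if __name__ == "__main__":
    for line in (typo, normal):
        print(has_with_with(line), line)
```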

EF> The from/reply-to address made from this program is always a randomly
EF> generated username with a valid domain.  The username seems to be 6 or more
EF> characters, often with few vowels.  Here's a few examples:

EF> I wonder if a low/med scoring rule can be created to look for usernames of 6
EF> or more alpha only chars with large groups (4+) of back-to-back consonants?
EF> Sticking with 6 or more chars should avoid simple abbreviations like
EF> [EMAIL PROTECTED] or [EMAIL PROTECTED], but be more successful with
EF> [EMAIL PROTECTED]

I use:
header    RM_fl_ConsWord6s        From =~ /\b[bcghjklmnpqrtvwxz]{6,20}\b/
describe  RM_fl_ConsWord6s        From contains word consisting of consecutive consonants
score     RM_fl_ConsWord6s        3.000  # 460s/1h of 97268 corpus (79437s/17831h) 01/24/04
header    RM_fl_ConsWord9         From =~ /\b[bcghjklmnpqrstvwxz]{9,20}\b/
describe  RM_fl_ConsWord9         From contains word consisting of consecutive consonants
score     RM_fl_ConsWord9         3.000  # type=spamp - 137s/0h of 97268 corpus (79437s/17831h) 01/24/04

Note that the 6-consonant test has had "s" removed to cut down on ham
hits.
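
To see how the two patterns differ, here is a rough Python sketch that runs both against a few From lines. The addresses are invented for illustration and come from no corpus, the helper name hits is hypothetical, and Python's re module is assumed to treat these simple patterns the same way Perl does.

```python
import re

# Character classes copied from the two rules above (case-sensitive as written).
# Note that "d" and "f" are absent from both classes in the original rules.
CONS6S = re.compile(r"\b[bcghjklmnpqrtvwxz]{6,20}\b")   # no "s"
CONS9 = re.compile(r"\b[bcghjklmnpqrstvwxz]{9,20}\b")   # includes "s"

# Made-up From headers, for illustration only.
samples = [
    "From: xkqwrtz@example.com",     # 7 consonants, no "s"
    "From: xkqwrtzsp@example.com",   # 9 consonants, one of them "s"
    "From: strngths@example.com",    # 8 consonants, starts and ends with "s"
    "From: bob@example.com",         # ordinary short name
]


def hits(header):
    """Return the names of the rules whose pattern matches the header text."""
    names = []
    if CONS6S.search(header):
        names.append("RM_fl_ConsWord6s")
    if CONS9.search(header):
        names.append("RM_fl_ConsWord9")
    return names


if __name__ == "__main__":
    for header in samples:
        print(header, "->", hits(header))
```

Because of the \b anchors, the whole word has to come from the class, so a single "s" anywhere in the word keeps the 6-consonant rule from firing.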

Bob Menschel


