We've been using this with great success:

uri GEOCITIES_SPAM m'^https?://[a-z]+\.geocities\.com/([a-z]+/)+[?][a-z]+'i
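
For anyone who wants to see what that pattern does and does not catch before deploying it, here is a minimal sketch in Python. The regex is copied from the rule above; the sample URLs and the helper name is_hit are made up for illustration, and Python's re module is only standing in for SpamAssassin's Perl regex engine.

```python
import re

# Pattern copied from the GEOCITIES_SPAM rule; the trailing 'i' flag becomes re.IGNORECASE.
GEOCITIES_SPAM = re.compile(
    r"^https?://[a-z]+\.geocities\.com/([a-z]+/)+[?][a-z]+",
    re.IGNORECASE,
)


def is_hit(url):
    """Return True if the URL would trigger the pattern (hypothetical helper)."""
    return GEOCITIES_SPAM.search(url) is not None


# Hypothetical sample URLs, not taken from real spam or ham.
SAMPLES = [
    "http://uk.geocities.com/someuser/?abc",         # letters-only path, then "?" plus letters
    "https://WWW.GEOCITIES.COM/a/b/?x",              # case-insensitive, several path segments
    "http://uk.geocities.com/some_user/?abc",        # underscore in the path segment
    "http://www.geocities.com/someuser/index.html",  # ordinary page, no "/?"
]

for url in SAMPLES:
    print(f"{is_hit(url)!s:5}  {url}")
```

Note that the path segments only allow letters, so in this sketch a directory name containing an underscore or digits does not hit.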

Andrew Hoying



From: "Maurice Lucas" <[EMAIL PROTECTED]nl>
Date: 10/10/2005 08:21 AM
To: "Loren Wilton" <[EMAIL PROTECTED]>, "Matthew Newton" <[EMAIL PROTECTED]>
Cc: <users@spamassassin.apache.org>
Subject: Re: Explosion in uk.geocities.com spam




Matthew Newton wrote:
> On Sat, Oct 08, 2005 at 10:01:22PM -0700, Loren Wilton wrote:
>>> They use HTML and tables very cleverly, thus avoiding Bayes rules.
>>> Basically it is an invisible table with one row and several
>>> columns. The first column contains the first letter of every line,
>>> separated by "<BR>" and optionally some style tags (b, i, etc.).
>>> The next column contains several more characters for each line, etc.
>>
>> Leo.  There are a good 9 or 10 variations on this now.  The SARE
>> rulesets have a number of rules that catch many of these, though not
>> all of them.
>
> On the assumption that "normal" URLs don't use the construct /? in
> them, and especially at geocities (are CGI scripts even allowed
> there?) how about the following?
>
> full      UOLCC_UKGEO /http:\/\/uk.geocities.com\/[A-Z]?[a-z]{2,20}_[A-Z]?[a-z]{2,20}(?:_[A-Z]?[a-z]{2,20})?\d{0,4}\/\?[\w=\.]{3}/
> describe  UOLCC_UKGEO UK Geocities exploitation
> score     UOLCC_UKGEO 4.0
>
> I've been testing this for a couple of weeks now and have had no
> complaints yet (though I do not have a corpus of spam to test it
> against, so I can't be too sure).
>
> It could possibly also be condensed to the following (completely
> untested):
>
> full      UOLCC_UKGEO /http:\/\/..\.geocities\.com\/[A-Za-z0-9_]{2,40}\/\?[\w=\.]{3}/
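
Since the condensed version is described as untested, a quick side-by-side check may help. The sketch below runs both of Matthew's patterns, copied verbatim from the quoted rules, over a few made-up URLs. The URLs and the helper name compare are hypothetical, and Python's re module stands in for Perl here.

```python
import re

# Both patterns are copied verbatim from the quoted UOLCC_UKGEO rules.
ORIGINAL = re.compile(
    r"http:\/\/uk.geocities.com\/[A-Z]?[a-z]{2,20}_[A-Z]?[a-z]{2,20}"
    r"(?:_[A-Z]?[a-z]{2,20})?\d{0,4}\/\?[\w=\.]{3}"
)
CONDENSED = re.compile(
    r"http:\/\/..\.geocities\.com\/[A-Za-z0-9_]{2,40}\/\?[\w=\.]{3}"
)


def compare(text):
    """Return (original_hits, condensed_hits) for a piece of message text."""
    return (
        ORIGINAL.search(text) is not None,
        CONDENSED.search(text) is not None,
    )


# Hypothetical sample URLs, not taken from a real corpus.
SAMPLES = [
    "http://uk.geocities.com/john_smith42/?abc",        # name_name plus digits, then "/?"
    "http://it.geocities.com/john_smith42/?abc",        # same path on another two-letter host
    "http://uk.geocities.com/johnsmith/?abc",           # no underscore in the directory name
    "http://uk.geocities.com/john_smith42/index.html",  # no "/?" after the directory
]

for url in SAMPLES:
    print(compare(url), url)
```

In this sketch the condensed pattern is noticeably broader: it also hits other two-letter subdomains and directory names without an underscore, so it would need its own testing rather than inheriting the results of the original.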

I saw somebody else use:
uri  UK_GEOCITIES   m'^http://uk\.geocities\.com\b'i
describe UK_GEOCITIES Body contains spammed domain
score   UK_GEOCITIES 3.0
uri  MSN_SPACES  m'^http://spaces\.msn\.com\/members\b'i
describe MSN_SPACES Body contains spammed domain
score   MSN_SPACES 3.0
uri  IT_GEOCITIES   m'^http://it\.geocities\.com\b'i
describe IT_GEOCITIES Body contains spammed domain
score   IT_GEOCITIES 3.0

PLEASE NOTE: I haven't used these rules myself, so I don't know their
false-positive (FP) rate.
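
One rough way to get a feel for the FP rate is to run the three patterns over a list of URLs pulled from known-good mail. The following is only a sketch: the patterns are copied from the rules above, while the URL list and the helper name count_hits are hypothetical, and Python's re module stands in for SpamAssassin's own matching.

```python
import re

# Patterns copied from the three uri rules; the 'i' flag becomes re.IGNORECASE.
RULES = {
    "UK_GEOCITIES": re.compile(r"^http://uk\.geocities\.com\b", re.IGNORECASE),
    "MSN_SPACES": re.compile(r"^http://spaces\.msn\.com\/members\b", re.IGNORECASE),
    "IT_GEOCITIES": re.compile(r"^http://it\.geocities\.com\b", re.IGNORECASE),
}


def count_hits(urls):
    """Count how many of the given URLs trigger each rule (hypothetical helper)."""
    hits = {name: 0 for name in RULES}
    for url in urls:
        for name, pattern in RULES.items():
            if pattern.search(url):
                hits[name] += 1
    return hits


# Hypothetical stand-in for URLs extracted from legitimate mail.
HAM_URLS = [
    "http://uk.geocities.com/some_hobby_page/index.html",
    "http://spaces.msn.com/members/someblog/",
    "http://www.geocities.com/someuser/",
    "https://uk.geocities.com/someuser/",
]

print(count_hits(HAM_URLS))
```

Because these rules match on the host alone, any legitimate page on those hosts hits them, which is where the FP risk comes from. They are also anchored on http:// only, so the https URL in the sample list does not hit.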

With kind regards,

Maurice Lucas
TAOS-IT


