Re: Help with rule for geocities spam

2006-05-23 Thread Daryl C. W. O'Shea

On 5/23/2006 2:51 AM, Benny Pedersen wrote:

>> http://wiki.apache.org/spamassassin/WebRedirectPlugin
>
> there is a slight config error on the page
>
> http://people.apache.org/~dos/sa-plugins/3.1/WebRedirect.cf
> http://people.apache.org/~dos/sa-plugins/3.1/WebRedirect.pm
>
> the loadplugin line in the cf file should really be in a pre file and
> commented out in the cf file
>
> just be careful not to use loadplugin in a cf file
>
> i have made a local.pre for plugins that are 3rd party


It's only a problem if you want to add more rules that rely on the 
plugin in a file that comes before WebRedirect.cf alphabetically.  Of 
course anyone who would add their own rules using the interface provided 
by the plugin should know enough to load the plugin before their rules.


Providing two files rather than three is a fairly safe trade-off, since 
not many people will add their own rules anyway.



Daryl


Help with rule for geocities spam

2006-05-22 Thread Kenneth Porter
I just grepped my entire mail hierarchy for .geocities.com and the only 
legitimate stuff I see either uses the www or uk subdomains. How can I 
write a rule that matches on that? If it were just one subdomain I could 
write one rule for all subdomains and one for just the one subdomain and 
use a negative score for the latter to match the positive score for the 
all-subdomain rule. But how do I handle two good subdomains?


RE: Help with rule for geocities spam

2006-05-22 Thread Bowie Bailey
Kenneth Porter wrote:
> I just grepped my entire mail hierarchy for .geocities.com and the
> only legitimate stuff I see either uses the www or uk subdomains. How
> can I write a rule that matches on that? If it were just one
> subdomain I could write one rule for all subdomains and one for just
> the one subdomain and use a negative score for the latter to match
> the positive score for the all-subdomain rule. But how do I handle
> two good subdomains?

I assume you mean www.geocities.com and uk.geocities.com, right?

Try this:

/(?:www|uk)\.geocities\.com/

Add other anchors as appropriate...
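
A minimal sketch for trying the pattern out, written in Python rather than Perl (the regex syntax used here is the same in both). The anchored variant is only one possible reading of "add other anchors": it assumes the rule sees full http:// or https:// URIs, and the sample URIs and function names are made up for illustration.

```python
import re

# The pattern as posted: it can match anywhere inside the URI.
GOOD_LOOSE = re.compile(r"(?:www|uk)\.geocities\.com")

# One possible anchoring (an assumption, not from the thread): the host must
# start right after the scheme and end at the first '/', ':', '?', '#' or
# at the end of the string.
GOOD_ANCHORED = re.compile(
    r"^https?://(?:www|uk)\.geocities\.com(?:[/:?#]|$)", re.IGNORECASE
)


def is_good_loose(uri: str) -> bool:
    """True if the unanchored pattern matches anywhere in the URI."""
    return GOOD_LOOSE.search(uri) is not None


def is_good_anchored(uri: str) -> bool:
    """True only if the host is exactly www.geocities.com or uk.geocities.com."""
    return GOOD_ANCHORED.search(uri) is not None


if __name__ == "__main__":
    for uri in (
        "http://www.geocities.com/someuser",
        "http://uk.geocities.com/someuser",
        "http://spamwww.geocities.com/someuser",
    ):
        print(uri, is_good_loose(uri), is_good_anchored(uri))
```

The unanchored form also matches a host such as spamwww.geocities.com, because "www.geocities.com" appears inside it; that is the case the extra anchors are meant to close.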

-- 
Bowie


RE: Help with rule for geocities spam

2006-05-22 Thread Kenneth Porter
On Monday, May 22, 2006 12:28 PM -0400 Bowie Bailey [EMAIL PROTECTED] 
wrote:



> I assume you mean www.geocities.com and uk.geocities.com, right?
>
> Try this:
>
> /(?:www|uk)\.geocities\.com/
>
> Add other anchors as appropriate...


Doh! That was too easy! :P

BTW, in my corpus the only legit uses of other subdomains are in samples 
a year or more old.





Re: Help with rule for geocities spam

2006-05-22 Thread Michael Monnerie
On Monday, 22 May 2006 18:28 Bowie Bailey wrote:
> /(?:www|uk)\.geocities\.com/

Or the full line could be:
uri      ZMIgeocitiesGOOD m{(?:www|uk)\.geocities\.com}
describe ZMIgeocitiesGOOD probably good geocities site
score    ZMIgeocitiesGOOD -1.2

or whatever score you want to give them.

Best regards, zmi
-- 
// Michael Monnerie, Ing.BSc-  http://it-management.at
// Tel: 0660/4156531  .network.your.ideas.
// PGP Key:   lynx -source http://zmi.at/zmi3.asc | gpg --import
// Fingerprint: 44A3 C1EC B71E C71A B4C2  9AA6 C818 847C 55CB A4EE
// Keyserver: www.keyserver.net Key-ID: 0x55CBA4EE




Re: Help with rule for geocities spam

2006-05-22 Thread Kenneth Porter
On Monday, May 22, 2006 7:24 PM +0200 Michael Monnerie 
[EMAIL PROTECTED] wrote:



> Or the full line could be:
> uri      ZMIgeocitiesGOOD m{(?:www|uk)\.geocities\.com}
> describe ZMIgeocitiesGOOD probably good geocities site
> score    ZMIgeocitiesGOOD -1.2
>
> or whatever score you want to give them.


Does a uri rule count once per instance or for all matching uris? If, for 
instance, I have that rule and one matching *all* subdomains with a +1.2, 
does a spammer just have to insert a good uri to nullify the score for 
the bad one?


Alternatively, is there regex syntax to match all patterns *except* the one 
given? Can I somehow express all geocities.com subdomains except www and 
uk as a regex?





RE: Help with rule for geocities spam

2006-05-22 Thread Bowie Bailey
Kenneth Porter wrote:
> On Monday, May 22, 2006 7:24 PM +0200 Michael Monnerie
> [EMAIL PROTECTED] wrote:
>
>> Or the full line could be:
>> uri      ZMIgeocitiesGOOD m{(?:www|uk)\.geocities\.com}
>> describe ZMIgeocitiesGOOD probably good geocities site
>> score    ZMIgeocitiesGOOD -1.2
>>
>> or whatever score you want to give them.
>
> Does a uri rule count once per instance or for all matching uris? If,
> for instance, I have that rule and one matching *all* subdomains with
> a +1.2, does a spammer just have to insert a good uri to nullify
> the score for the bad one?

The URI rule just asks "does this exist in the message?", so it will
only hit once per message.  And yes, spammers could take advantage of
this rule.  This is why there are not many negative-scoring rules in
SA.
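
A rough sketch of that scoring behaviour, in Python. This is a toy model, not SpamAssassin code: the rule names, the rule table and the score_message helper are hypothetical, and only the "each rule counts at most once per message" behaviour is taken from the explanation above.

```python
import re

# Hypothetical rule table: (name, pattern, score). The names and scores are
# illustrative and are not real SpamAssassin rules.
RULES = [
    ("GEOCITIES_ANY", re.compile(r"\.geocities\.com"), 1.2),
    ("GEOCITIES_GOOD", re.compile(r"(?:www|uk)\.geocities\.com"), -1.2),
]


def score_message(uris):
    """Sum rule scores; each rule contributes at most once per message,
    no matter how many URIs in the message match it."""
    total = 0.0
    for _name, pattern, score in RULES:
        if any(pattern.search(uri) for uri in uris):
            total += score
    return total


if __name__ == "__main__":
    spam_only = ["http://spam.geocities.com/a", "http://junk.geocities.com/b"]
    padded = spam_only + ["http://www.geocities.com/"]
    print(score_message(spam_only))  # the positive rule fires once
    print(score_message(padded))     # the added good URI cancels it
```

With this pair of rules, a message full of bad geocities links scores the same as one bad link, and a single www.geocities.com link brings the total back to zero.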

> Alternatively, is there regex syntax to match all patterns *except*
> the one given? Can I somehow express all geocities.com subdomains
> except www and uk as a regex?

That is a bit trickier because Perl does not currently support
variable length look-behinds.  But you can get around that by using
two separate look-behinds like this:

/(?<!\bwww)(?<!\buk)\.geocities\.com/

Note that you have to anchor both options separately.
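
A small sketch for checking the look-behind approach, in Python (fixed-width negative look-behind is written `(?<!...)` in both Perl and Python). The sample hosts and the helper name are made up for illustration. For comparison it also compiles the look-ahead spelling `(?!...)`, which tests the text after the current position and therefore excludes nothing here.

```python
import re

# Negative look-behind: match ".geocities.com" unless the label immediately
# before it is exactly "www" or exactly "uk".
BAD_SUBDOMAIN = re.compile(r"(?<!\bwww)(?<!\buk)\.geocities\.com")

# Look-ahead spelling, for comparison only. At the point where it is tested,
# the following text always starts with ".geocities.com", never "www" or
# "uk", so both assertions always succeed.
LOOKAHEAD_FORM = re.compile(r"(?!\bwww)(?!\buk)\.geocities\.com")


def is_bad_subdomain(uri: str) -> bool:
    """True if the URI uses a geocities.com subdomain other than www or uk."""
    return BAD_SUBDOMAIN.search(uri) is not None


if __name__ == "__main__":
    for uri in (
        "http://www.geocities.com/",
        "http://uk.geocities.com/",
        "http://spam.geocities.com/",
        "http://awww.geocities.com/",
    ):
        print(uri, is_bad_subdomain(uri))
```

The `\b` inside each look-behind is what keeps a label such as awww or duk from being treated as www or uk.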

-- 
Bowie


RE: Help with rule for geocities spam

2006-05-22 Thread Matthew.van.Eerde
Bowie Bailey wrote:
> Kenneth Porter wrote:
>> Alternatively, is there regex syntax to match all patterns *except*
>> the one given? Can I somehow express all geocities.com subdomains
>> except www and uk as a regex?
>
> That is a bit trickier because Perl does not currently support
> variable length look-behinds.  But you can get around that by using
> two separate look-behinds like this:
>
> /(?<!\bwww)(?<!\buk)\.geocities\.com/

In this specific case, this might suffice:
/[^wu][^wk]\.geocities\.com/i

... but this pattern does not generalize well.

-- 
Matthew.van.Eerde (at) hbinc.com   805.964.4554 x902
Hispanic Business Inc./HireDiversity.com   Software Engineer


RE: Help with rule for geocities spam

2006-05-22 Thread Bowie Bailey
[EMAIL PROTECTED] wrote:
> Bowie Bailey wrote:
>> Kenneth Porter wrote:
>>> Alternatively, is there regex syntax to match all patterns
>>> *except* the one given? Can I somehow express all geocities.com
>>> subdomains except www and uk as a regex?
>>
>> That is a bit trickier because Perl does not currently support
>> variable length look-behinds.  But you can get around that by using
>> two separate look-behinds like this:
>>
>> /(?<!\bwww)(?<!\buk)\.geocities\.com/
>
> In this specific case, this might suffice:
> /[^wu][^wk]\.geocities\.com/i

This is probably a less expensive regex, but it does not match quite
the same thing.  It will only match a subdomain whose second-to-last
character is not w or u and whose last character is not w or k.

For instance, it will not match on squawk.geocities.com.
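
A quick sketch, in Python, of which hosts the character-class pattern actually hits. The host names and the hits helper are made up for illustration, and the helper assumes the rule sees the host inside a full http:// URI.

```python
import re

# The character-class pattern from the thread, case-insensitive as posted.
CHAR_CLASS = re.compile(r"[^wu][^wk]\.geocities\.com", re.IGNORECASE)


def hits(host: str) -> bool:
    """True if the pattern matches the host when embedded in an http URI."""
    return CHAR_CLASS.search("http://" + host + "/") is not None


if __name__ == "__main__":
    for host in (
        "www.geocities.com",
        "uk.geocities.com",
        "spam.geocities.com",
        "squawk.geocities.com",
        "book.geocities.com",
        "blue.geocities.com",
    ):
        print(host, hits(host))
```

Besides www and uk, the pattern also skips any subdomain that merely ends in w or k (book) or has w or u as its second-to-last character (blue), so it misses more than the two intended hosts.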

> ... but this pattern does not generalize well.

True, but neither does mine once you get past two or three
alternatives.

-- 
Bowie


Re: Help with rule for geocities spam

2006-05-22 Thread Kenneth Porter
As it turns out, I had a SARE rule installed that should catch these, but I 
found some spams leaking through due to the insecure dependency bug (bug 
3838), even though I'm running Perl 5.8.3. I'm applying Daryl C. W. 
O'Shea's patch for that bug.


Here's the SARE rule:

http://www.rulesemporium.com/rules/70_sare_specific.cf

(Look for __SARE_SPEC_XXGEOCITIE)


Re: Help with rule for geocities spam

2006-05-22 Thread jdow

From: [EMAIL PROTECTED]

> Bowie Bailey wrote:
>> Kenneth Porter wrote:
>>> Alternatively, is there regex syntax to match all patterns *except*
>>> the one given? Can I somehow express all geocities.com subdomains
>>> except www and uk as a regex?
>>
>> That is a bit trickier because Perl does not currently support
>> variable length look-behinds.  But you can get around that by using
>> two separate look-behinds like this:
>>
>> /(?<!\bwww)(?<!\buk)\.geocities\.com/
>
> In this specific case, this might suffice:
> /[^wu][^wk]\.geocities\.com/i
>
> ... but this pattern does not generalize well.

jdow: meh - simply use the easy rule for either www or uk.
Give it a score of 0.001 if you want to monitor it. Then use it
in a meta rule with a /geocities\.com/ rule. If the latter hits
and the former does not, give it 1000 points or whatever. If the
latter AND the former hit, be nice and only give it 999 + 1 points.
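
The boolean logic of such a meta rule can be sketched in Python. This only models the "any geocities URI AND NOT a www/uk URI" combination; it is not SpamAssassin configuration, and the function name and sample URIs are hypothetical.

```python
import re

# Sub-rule 1: the easy "good host" rule for www or uk.
GOOD = re.compile(r"(?:www|uk)\.geocities\.com")
# Sub-rule 2: any mention of geocities.com at all.
ANY = re.compile(r"geocities\.com")


def meta_hits(uris) -> bool:
    """Model of the meta rule: fires when some URI mentions geocities.com
    and no URI in the message matches the www/uk rule."""
    has_any = any(ANY.search(uri) for uri in uris)
    has_good = any(GOOD.search(uri) for uri in uris)
    return has_any and not has_good


if __name__ == "__main__":
    print(meta_hits(["http://spam.geocities.com/"]))
    print(meta_hits(["http://www.geocities.com/"]))
    print(meta_hits(["http://spam.geocities.com/", "http://www.geocities.com/"]))
```

Because both sub-rules are evaluated per message, a message that mixes a bad subdomain with a www or uk link does not fire the meta rule in this model, which is the same padding weakness discussed earlier in the thread.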

{^_-}


Re: Help with rule for geocities spam

2006-05-22 Thread jdow

From: Justin Mason [EMAIL PROTECTED]


> Kenneth Porter writes:
>> As it turns out, I had a SARE rule installed that should catch these, but I 
>> found some spams leaking through due to the insecure dependency bug (bug 
>> 3838), even though I'm running Perl 5.8.3. I'm applying Daryl C. W. 
>> O'Shea's patch for that bug.
>>
>> Here's the SARE rule:
>>
>> http://www.rulesemporium.com/rules/70_sare_specific.cf
>>
>> (Look for __SARE_SPEC_XXGEOCITIE)
>
> did it work?  if so, please add a report to that bug -- there
> are still very few comments indicating success.  (although I don't
> doubt that's just lack of comment, rather than a faulty patch.)


It is still working for me, Justin. I've removed my procmail double-tap
workaround, which fed a message through a second time if the first pass
failed to add markup.

{^_^}



Re: Help with rule for geocities spam

2006-05-22 Thread Daryl C. W. O'Shea

On 5/22/2006 6:14 PM, Kenneth Porter wrote:
> As it turns out, I had a SARE rule installed that should catch these, 
> but I found some spams leaking through due to the insecure dependency 
> bug (bug 3838), even though I'm running Perl 5.8.3. I'm applying Daryl 
> C. W. O'Shea's patch for that bug.
>
> Here's the SARE rule:
>
> http://www.rulesemporium.com/rules/70_sare_specific.cf
>
> (Look for __SARE_SPEC_XXGEOCITIE)


Just because someone spelling my entire name right caught my attention...

If you've got the bandwidth and processing time to spare, you might as 
well get Yahoo! to serve up the spam sites they're hosting:


http://wiki.apache.org/spamassassin/WebRedirectPlugin


Daryl