Re: [Mimedefang] regex filter unwanted words

2007-01-23 Thread WBrown
John Rudd wrote on 01/22/2007 06:17:48 PM:

 As many as you can fit.  But I would be very careful about it.  Plus, I
 would make sure to use \b around the words, so that you're not getting
 sub-string matches.  For example:

 \bsex\b  will match "sex" but not "Wessex".

I can't second this strongly enough!  I had a very *IRATE* user
complaining about not receiving email from his boss.  Turns out he had
created a rule in his mail client to block a certain four letter word and
forgot about it.  The problem started when he added his title, "Programmer
Analyst", to his signature block and stopped getting replies to his
messages.
___
NOTE: If there is a disclaimer or other legal boilerplate in the above
message, it is NULL AND VOID.  You may ignore it.

Visit http://www.mimedefang.org and http://www.roaringpenguin.com
MIMEDefang mailing list MIMEDefang@lists.roaringpenguin.com
http://lists.roaringpenguin.com/mailman/listinfo/mimedefang


Re: [Mimedefang] regex filter unwanted words

2007-01-23 Thread Kelson

John Rudd wrote:

if ($Subject =~ m/\b(sex|microsoft|Watch)\b/) {
    return action_bounce("bad subject");
}

However, as others have pointed out, it's not generally a good idea. 
Spammers change their subjects often enough that you'll have trouble 
keeping up.  Plus, you'll be very prone to false-positives.


Agreed.  One might say, "Watch out for false positives."

--
Kelson Vibber
SpeedGate Communications www.speed.net


Re: [Mimedefang] regex filter unwanted words

2007-01-23 Thread Joseph Brennan



Kelson [EMAIL PROTECTED] wrote:

  if($Subject =~ m/\b(sex|microsoft|Watch)\b/ )

 One might say, "Watch out for false positives."



Just don't say it in the subject line!

Joseph Brennan
Lead Email Systems Engineer
Columbia University Information Technology



Re: [Mimedefang] regex filter unwanted words

2007-01-23 Thread Ben Kamen

Or, as Kelson was once quoted (and now immortalized on my website since I 
laughed so hard)


Can I bounce by looking at keywords in the body without using SpamAssassin?


Can you? Yes.

Should you?  Probably not.

Blocking mail by keyword is considerably more likely to cause false positives
than score-based filters.  Some examples:

State of Virginia.
Breast cancer study.
The city of Intercourse, Pennsylvania.
News about assassinations.
Jokes or news about certain highly-advertised drugs.
Free software.
A sextet.  (Or sextuplets, or cities like Middlesex, Essex, Wessex, etc.)
John Hancock

You can probably think of more examples.

Plus, of course, $P@/\/\/\/\ERZ can just D|5GUl$3 orr miiispel there wurdz 2
@V0|D the keyword filter.  By the time you put together a sufficiently long
list of variations you may as well be using something more elaborate.


Kelson Vibber 
SpeedGate Communications www.speed.net
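
The false-positive examples in the quoted message can be tried against a small keyword filter. The following is only a sketch: the keyword list and the hits() helper are hypothetical, picked so that each quoted example trips at least one keyword when matched as a bare substring. It compares substring matching with whole-word (\b) matching, both case-insensitive.

```perl
use strict;
use warnings;

# Hypothetical keyword list (an assumption, not taken from any real filter).
my @keywords = qw(virgin breast intercourse ass free sex);

# Subjects based on the examples quoted above.
my @subjects = (
    'State of Virginia',
    'Breast cancer study',
    'The city of Intercourse, Pennsylvania',
    'News about assassinations',
    'Free software',
    'A sextet from Middlesex',
);

# Return the keywords that hit a subject, either as bare substrings
# ($whole_words false) or only as whole words ($whole_words true).
sub hits {
    my ($subject, $whole_words) = @_;
    return grep {
        my $kw = quotemeta $_;
        $whole_words ? $subject =~ /\b$kw\b/i : $subject =~ /$kw/i;
    } @keywords;
}

for my $subject (@subjects) {
    my @loose  = hits($subject, 0);
    my @strict = hits($subject, 1);
    printf "%-40s substring: %-12s whole word: %s\n",
        $subject, join(',', @loose) || '-', join(',', @strict) || '-';
}
```

Word boundaries remove the substring hits (Virginia, assassinations, sextet), but subjects where the keyword is a legitimate whole word (Breast, Intercourse, Free) are still caught, which is the point being made about keyword blocking in general.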



--
Ben Kamen
=
Email: bkamen AT benjammin DOT net  Web: http://www.benjammin.net

The ornaments of our house are the friends that frequent it.
-- Ralph Waldo Emerson


Re: [Mimedefang] regex filter unwanted words

2007-01-23 Thread WBrown
  You can probably think of more examples.

I always liked the example of the town of Scunthorpe in the UK.  See 
http://en.wikipedia.org/wiki/Scunthorpe_Problem

My wife used to have problems with "Hiscock" being part of her employer's
domain name.


Re: [Mimedefang] regex filter unwanted words

2007-01-23 Thread Richard Laager
On Tue, 2007-01-23 at 08:51 -0500, [EMAIL PROTECTED] wrote:
 John Rudd wrote on 01/22/2007 06:17:48 PM:
 
  As many as you can fit.  But I would be very careful about it.  Plus, I
  would make sure to use \b around the words, so that you're not getting
  sub-string matches.  For example:

  \bsex\b  will match "sex" but not "Wessex".
 
 I can't second this strongly enough!  I had a very *IRATE* user 
 complaining about not receiving email from his boss.  Turns out he had 
 created a rule in his mail client to block a certain four letter word and 
 forgot about it.  The problem started when he added his title Programmer 
 Analyst to his signature block and he stopped getting replies to his 
 messages.

The best one I ever ran into went like this: A user calls in to complain
that large attachments are being blocked. Smaller attachments work, but
at some unknown point when the messages become too big, they are
blocked. We eventually narrowed it down to a filter on "sex" (as well as
some others for 4-letter words) anywhere in the message body. My theory
was that as messages with attachments got larger and larger, the
probability of them containing "sex" in the base-64 encoded data
approached one. We disabled that filter rule, and everything worked
great again.

Richard
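
A rough back-of-the-envelope check of that theory, as a sketch under stated assumptions: the base64 text is treated as uniformly random over its 64-character alphabet, each starting position is treated as independent, line breaks in the encoded data are ignored, and the filter is assumed to match case-insensitively (the original message does not say). The function name hit_probability is made up for this illustration.

```perl
use strict;
use warnings;

# Approximate chance that a fixed 3-letter word appears somewhere in
# $chars characters of uniformly random base64 text.
sub hit_probability {
    my ($chars, $case_insensitive) = @_;
    my $word_len = 3;
    return 0 if $chars < $word_len;

    # A case-insensitive match accepts 2**3 = 8 spellings of the word.
    my $variants  = $case_insensitive ? 2 ** $word_len : 1;
    my $p         = $variants / (64 ** $word_len);   # chance at one position
    my $positions = $chars - $word_len + 1;

    # Independence approximation: 1 - P(no hit at any position).
    return 1 - (1 - $p) ** $positions;
}

for my $kb (1, 10, 100, 1000) {
    # base64 turns every 3 bytes into 4 characters.
    my $chars = int($kb * 1024 * 4 / 3);
    printf "%5d KB attachment: %.3f\n", $kb, hit_probability($chars, 1);
}
```

Under these assumptions the per-position chance is 8/262144, so a hit is expected roughly once per 32,768 base64 characters. That is consistent with small attachments usually getting through and large ones almost never doing so.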




[Mimedefang] regex filter unwanted words

2007-01-22 Thread dick hoogendijk
Some time ago I asked about filtering unwanted words. The advice was /
is not to do it, but I still want to try.

The filter rule was something like:

if ($Subject =~ m//) {
    return action_bounce("bad subject");
}

Question: do I put the unwanted words into this rule like this:

if ($Subject =~ m/sex|microsoft|Watch/) {
    return action_bounce("bad subject");
}

I'm not sure how to put in the regex. How many words can I put between
those two slashes of m// ?



Re: [Mimedefang] regex filter unwanted words

2007-01-22 Thread John Rudd

dick hoogendijk wrote:

Some time ago I asked about filtering unwanted words. The advice was /
is not to do it, but I still want to try.

The filter rule was something like:

if ($Subject =~ m//) {
    return action_bounce("bad subject");
}

Question: do I put the unwanted words into this rule like this:

if ($Subject =~ m/sex|microsoft|Watch/) {
    return action_bounce("bad subject");
}

I'm not sure how to put in the regex. How many words can I put between
those two slashes of m// ?



As many as you can fit.  But I would be very careful about it.  Plus, I 
would make sure to use \b around the words, so that you're not getting 
sub-string matches.  For example:


\bsex\b  will match "sex" but not "Wessex".

So, maybe something like this:

if ($Subject =~ m/\b(sex|microsoft|Watch)\b/) {
    return action_bounce("bad subject");
}

However, as others have pointed out, it's not generally a good idea. 
Spammers change their subjects often enough that you'll have trouble 
keeping up.  Plus, you'll be very prone to false-positives.
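
To see what \b changes, the two patterns from this thread can be run side by side outside of MIMEDefang. This is a standalone sketch: the matches() helper and the sample subjects are made up for illustration, and nothing here calls action_bounce.

```perl
use strict;
use warnings;

# The pattern from the question, and the \b-bounded variant suggested above.
my $unbounded = qr/sex|microsoft|Watch/;
my $bounded   = qr/\b(?:sex|microsoft|Watch)\b/;

# Hypothetical helper: 1 if the subject matches the pattern, else 0.
sub matches {
    my ($re, $subject) = @_;
    return $subject =~ $re ? 1 : 0;
}

for my $subject (
    'Wessex weather report',
    'Watch out for false positives',
    'Microsoft update',
) {
    printf "%-32s unbounded=%d bounded=%d\n",
        $subject, matches($unbounded, $subject), matches($bounded, $subject);
}
```

Both patterns are case-sensitive as written, so "Microsoft" with a capital M matches neither of them, while "Watch" at the start of a sentence matches both. Adding the /i modifier would catch more spellings and, with them, more legitimate mail.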




Re: [Mimedefang] regex filter unwanted words

2007-01-22 Thread Kenneth Porter
--On Monday, January 22, 2007 10:09 PM +0100 dick hoogendijk 
[EMAIL PROTECTED] wrote:



I'm not sure how to put in the regex. How many words can I put between
those two slashes of m// ?


Here's the Perl man page for regular expressions:

http://www.perl.com/doc/manual/html/pod/perlre.html
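
As a small illustration of the alternation syntax described there, the word list can also be kept in an array and joined into one pattern, which is easier to maintain than a long hand-written m//. This is a sketch only: @unwanted and subject_is_unwanted() are hypothetical names, and the three words are the ones from the question.

```perl
use strict;
use warnings;

# Hypothetical word list; the three words come from the original question.
my @unwanted = ('sex', 'microsoft', 'Watch');

# quotemeta escapes regex metacharacters so each entry matches literally;
# the entries are then joined with | and wrapped in word boundaries.
my $alternation = join '|', map { quotemeta($_) } @unwanted;
my $pattern     = qr/\b(?:$alternation)\b/;

# Return 1 if the subject contains one of the unwanted words, else 0.
sub subject_is_unwanted {
    my ($subject) = @_;
    return $subject =~ $pattern ? 1 : 0;
}

print subject_is_unwanted('Watch this'), "\n";
print subject_is_unwanted('Wessex rail news'), "\n";
```

In a real filter the return value would decide whether to bounce; here it is only printed.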

