SA RegEx Rules

2009-06-27 Thread Cory Hawkless
Hi all,

 

I've been doing some reading on RegEx, and even coming from a programming
background it is a bit intimidating. My problem is that I haven't been able
to find a good source of information on exactly what SpamAssassin matches
RegEx rules against when scanning, and how, or on which variant of RegEx is
being used (i.e. what syntax is and is not allowed).

 

I'd like to be able to make my own simple rules but it's proving quite
difficult. Maybe a tool that I can use to build Regular Expressions would
help?

 

I'm sure there are PLENTY of others out there who are rather bamboozled by
this too and would benefit greatly from any assistance.

 

Thanks in advance

Cory

 

 



Re: SORBS bites the dust

2009-06-27 Thread rich...@buzzhost.co.uk
On Fri, 2009-06-26 at 21:06 -0400, Charles Gregory wrote:
 On Fri, 26 Jun 2009, LuKreme wrote:
   See, it all comes down to what you think 'legitimate' is.
  The recipient wants the e-mail. DUH.
  That's not my definition at all
 
 The very reason for my posting. You need not repeat yourself.
 
  . it's not even the definition of any mailadmin I've ever met. We 
  reject mail users *want* all the time. It's our job.
There is some mileage in that. Staff mailing massive, unnecessary
attachments around is one such example of inappropriate use. The
recipients may well *want* these - but policies are often in place to
limit them.
 That got a genuine laugh Sounds like something out of the BOFH series.
 
  Nope, sometimes people WANT email that is laden down with malware, 
  viruses, executable files, web bugs, or other things that compromise the 
  security of not just themselves, but of others.
Yep - I've had users call up asking why they have not had an email with a
file attachment they are expecting. You tell them "It has a virus" or
"It is not company policy to accept executable files by email", but do
they stop there? Oh no. They get the sender to try and forward it via
Hotmail or to a webmail account. When that blocks it too, you see the
sender try again - this time zipping it up and crap. So yes - there are
occasions when mailadmins block mail that recipients want and it is
correct to do so.

The thread has drifted and seems to be starting to take on the role of
the Oxford English Dictionary of IT-related words.

Legitimate mail? Just what is it? One man's legitimate is another man's
illegitimate. One man's spam is another man's ham.

I apply a simple formula.
Legitimate mail comes from mail servers running on static IPs. These
will not fall in a range assigned as Dynamic. They will not be listed in
the PBL. The connecting IP will have - as a minimum - a PTR record. The
contents of which I'm not fussed about - it just needs to exist. That
will have me at least happy to 'listen' to what that server has to say
before making a decision on the mail it is sending. I've dealt with
small African businesses out in the bush operating mail servers over
miles of knackered telephone lines on modems, and even they can manage
to satisfy such basic requirements. If any other mail admin is not
capable of doing this then I don't want a connection from them (I
probably would not want them working for my organisation either - not if
I relied on email for my business).
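The two DNS checks in that formula (a PTR record exists, the address is not in the PBL) both start from the reversed IP address. Here is a minimal Python sketch of how the lookup names are built, assuming IPv4 only. The helper names are hypothetical and no DNS query is actually sent; `pbl.spamhaus.org` is the Spamhaus PBL zone.

```python
import ipaddress

PBL_ZONE = "pbl.spamhaus.org"  # Spamhaus Policy Block List zone


def ptr_name(ip: str) -> str:
    """Name to query for the PTR record of an address."""
    return ipaddress.ip_address(ip).reverse_pointer


def dnsbl_name(ip: str, zone: str = PBL_ZONE) -> str:
    """Name to query in a DNSBL zone: reversed IPv4 octets plus the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def worth_listening_to(has_ptr: bool, listed_in_pbl: bool) -> bool:
    """Hypothetical policy from the post: any PTR at all, and not in the PBL."""
    return has_ptr and not listed_in_pbl


if __name__ == "__main__":
    ip = "192.0.2.10"  # documentation address, used only as an example
    print(ptr_name(ip))
    print(dnsbl_name(ip))
```

The results of the two lookups would be fed into `worth_listening_to`; how the queries are made (resolver library, MTA feature) is left open here.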

Email has some similarities to snail mail. The onus is on the sender to
ship it correctly and NOT on the recipient. The sender must package and
address it correctly, put the right postage on it, and send it from the
correct place if you want delivery attempted on time or at all. You
would not expect your snail mail to be collected from a trash can and
delivered, you would use a defined mail box or post office.

Legitimate mail to me comes from a legitimate server as above. Its
content will then be:
1. A reply to a mail we have sent
2. An order, enquiry or quote
3. A staff message or memo
4. A request for help

There may be a few others, but legitimate mail will not generally be:

1. Someone trying to sell us something
2. Notifications of 'Special Offers'
3. Catch up mails from people we once bought a pencil from
4. From gmail, yahoo or hotmail. By far all I ever see from these
providers is Spam. If someone really does *not* have access to any other
form of email they can pick up the phone and call us and we can exempt
them. I've yet to find a legitimate business use any of them as their
primary email provider. Postini customers are also pushing their luck
with the way the sending server never sends a 'QUIT' on the end of the
session. This kind of sloppy crap is a different story but is mentioned
to show that even so called professional email organisations can be
sloppy and not do things as they should.

Finally - and this is the point where it is specifically relevant to
Spamassassin - it won't trip a set score in SA. There is no need for
legitimate mail to score high with SA.

That's my take on it and it works for us. We get the odd gripe from
managers called 'Steve' and 'Barry' that they have not had the 200 meg
of pictures from the weekend party. You know the kind - the self
important 'rules are not relevant to me' kind. It is usually sufficient
to remind them of the acceptable usage policy and that we are
overstaffed.






Re: SA RegEx Rules

2009-06-27 Thread rich...@buzzhost.co.uk
On Sat, 2009-06-27 at 16:56 +0930, Cory Hawkless wrote:
 Hi all,
 
  
 
 Been doing some reading on RegEx and even coming from a programming
 background it is a bit intimidating, my problem is I haven’t been able
 to find a good source of information on exactly what\how SpamAssassin
 matches the RegEx rules when scanning and what variant of RegEx is
 being used?(I.E what syntax is and is not allowed?)
 
  
 
 I’d like to be able to make my own simple rules but it’s proving quite
 difficult, Maybe a tool that I can use the build Regular Expressions
 would help?
 
  
 
 I’m sure there are PELNTY of other out ther that are rather bamboozled
 by this also and would benefit greatly from any assistance.
 
  
 
 Thanks in advance
 
 Cory
 
  
http://www.regexbuddy.com/



Re: SORBS bites the dust

2009-06-27 Thread Michael Grant
Unless I've missed a message... this is the 100th reply to this
thread.  This has to be one of the longest threads I've seen on this
list in years.

I have to say I have issues with your definition of legit mail.  Many
people do send mail to other people out of the blue for legit reasons
other than having some previous relation with that person.

 4. From gmail, yahoo or hotmail.

These sites do provide an important service for people.  Not everyone
is tech-savvy enough to get their own domain name.  If everyone had to use
their ISP's domain name, think of the mess each time you changed your
ISP.

But in general, there is definitely a grey area about what is and what
isn't legit email and I have to say that spamassassin does do a pretty
decent job much of the time sorting it out.


Re: SORBS bites the dust

2009-06-27 Thread Arvid Picciani

Michael Grant wrote:

Unless I've missed a message... this is the 100th reply to this
thread.  This has to be one of the longest threads I've seen on this
list in years.

  
Shows there is much to discuss on this matter. Isn't there a generic 
spam related  mailing list?


Re: SORBS bites the dust

2009-06-27 Thread Yet Another Ninja

On 6/27/2009 10:55 AM, Arvid Picciani wrote:

Michael Grant wrote:

Unless I've missed a message... this is the 100th reply to this
thread.  This has to be one of the longest threads I've seen on this
list in years.

  
Shows there is much to discuss on this matter. Isn't there a generic 
spam related  mailing list?


spam-l.com


Re: [NEW SPAM FLOOD] www.shopXX.net

2009-06-27 Thread Jeremy Morton
Why are you bothering with that?  It seems unnecessarily complex. 
Here's my amended rule:


/\bwww\s?\W?\s?\w{3,6}\d{2,6}\s?\W?\s?(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i

Best regards,
Jeremy Morton (Jez)

John Hardin wrote:

On Fri, 26 Jun 2009, Paweł Tęcza wrote:


On Tue, 2009-06-23 at 09:39 +0200, Paweł Tęcza wrote:


body OBFU_URI_WWDD_2
/\bwww\s(?:\W\s)?\w{3,6}\d{2,6}\s(?:\W\s)?(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i



The spammers struck again at the weekend. Unfortunately the rule above
doesn't work for the latest incarnation of that spam, which looks like
"www. pill22. com".


{sung to the tune of Peter Gabriel's Kiss That Frog} Whack that mole!

/\bwww(?:\s|\s\W|\W\s)\w{3,6}\d{2,6}(?:\s|\s\W|\W\s)(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i
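
To see what changed between the two rules, they can be run against a few sample strings. This sketch uses Python's `re` module as a stand-in for Perl; the constructs used here behave the same way in both. It assumes the second alternation in the newer rule was meant to read `\s\W`, mirroring the first. Apart from "www. pill22. com", the sample strings are made up for illustration.

```python
import re

# The earlier OBFU_URI_WWDD_2 pattern: requires whitespace right after "www".
OLD_RULE = re.compile(
    r"\bwww\s(?:\W\s)?\w{3,6}\d{2,6}\s(?:\W\s)?"
    r"(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b",
    re.IGNORECASE,
)

# The follow-up pattern: one whitespace, or whitespace plus one non-word
# character in either order, between the parts.
NEW_RULE = re.compile(
    r"\bwww(?:\s|\s\W|\W\s)\w{3,6}\d{2,6}(?:\s|\s\W|\W\s)"
    r"(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b",
    re.IGNORECASE,
)

SAMPLES = [
    "www pill22 com",
    "www . pill22 . com",
    "www. pill22. com",
    "www.pill22.com",
]


def results(rule):
    """Map each sample string to whether the rule finds a match in it."""
    return {sample: bool(rule.search(sample)) for sample in SAMPLES}


if __name__ == "__main__":
    for name, rule in (("old", OLD_RULE), ("new", NEW_RULE)):
        print(name, results(rule))
```

Under these assumptions the older pattern hits the first two samples only, while the newer one hits the first and third. Neither fires on the plain "www.pill22.com", which is left to normal URI processing.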




Re: SORBS bites the dust

2009-06-27 Thread rich...@buzzhost.co.uk
On Sat, 2009-06-27 at 10:59 +0200, Yet Another Ninja wrote:
 On 6/27/2009 10:55 AM, Arvid Picciani wrote:
  Michael Grant wrote:
  Unless I've missed a message... this is the 100th reply to this
  thread.  This has to be one of the longest threads I've seen on this
  list in years.
 

  Shows there is much to discuss on this matter. Isn't there a generic 
  spam related  mailing list?
 
 spam-l.com
NANAE ?



Re: SA RegEx Rules

2009-06-27 Thread RW
On Sat, 27 Jun 2009 16:56:33 +0930
Cory Hawkless c...@hawkless.id.au wrote:

 Hi all,
 
  
 
 Been doing some reading on RegEx and even coming from a programming
 background it is a bit intimidating, my problem is I haven't been
 able to find a good source of information on exactly what\how
 SpamAssassin matches the RegEx rules when scanning and what variant
 of RegEx is being used?(I.E what syntax is and is not allowed?)


 Perl  


Re: SA RegEx Rules

2009-06-27 Thread Karsten Bräckelmann
On Sat, 2009-06-27 at 16:56 +0930, Cory Hawkless wrote:
 Been doing some reading on RegEx and even coming from a programming
 background it is a bit intimidating, my problem is I haven’t been able
 to find a good source of information on exactly what\how SpamAssassin
 matches the RegEx rules when scanning

Depends on the rule type you use. See the Rule Definitions section in
the docs [1] and the Rule Writing intro guide [2] in the wiki.

Between header, uri, body, and the other body rules, the scope of
matching varies significantly. The type of rule also may have an impact
on how the data is rendered (or not).

 and what variant of RegEx is being used?(I.E what syntax is and is not
 allowed?)

Perl Regular Expressions. The perlre [3] docs are quite heavy and best
used as a comprehensive reference, though. For some gentle introduction
and a longer tutorial, see the links in the Description section of [3].


 I’d like to be able to make my own simple rules but it’s proving quite
 difficult, Maybe a tool that I can use the build Regular Expressions
 would help? 
 
 I’m sure there are PELNTY of other out ther that are rather bamboozled
 by this also and would benefit greatly from any assistance.

Besides writing your own custom rules, this list is a good source of
general advice and of ready-made rules targeting new spammer patterns. I
suggest checking the (recent-ish) archives and lurking on the list. You can
learn and catch a great deal that way.

  guenther


[1] http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Conf.html
[2] http://wiki.apache.org/spamassassin/WritingRules
[3] http://perldoc.perl.org/perlre.html

-- 
char *t=\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4;
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1:
(c=*++x); c128  (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}



gpg signed spam email ???

2009-06-27 Thread RobertH

i was reading at

http://www.karan.org/blog/

specifically

http://www.karan.org/blog/index.php/2009/06/15/gpg-signed-spam

that he received a GPG-signed spam email.

I've never heard of that before, though I haven't thought much about it or
studied it...

Q: is this unheard of, or common?

As near as I can quickly investigate, it doesn't appear to be common, as per
papa google [sic].

comments? feedback?

just trying to get up on the curve now.

tia

 - rh



Re: gpg signed spam email ???

2009-06-27 Thread Benny Pedersen

On Sat, June 27, 2009 16:02, RobertH wrote:

 just trying to get up on the curve now.

It all comes down to this: do you trust the sender? Whether you verify this
with gpg or not is not the point.

Mail::SpamAssassin::Plugin::Konfidi
Mail::SpamAssassin::Plugin::OpenPGP

Both can use gpg to verify trusted senders, but why not use DKIM?

--
xpoint



Re: [NEW SPAM FLOOD] www.shopXX.net

2009-06-27 Thread John Hardin

On Sat, 27 Jun 2009, Jeremy Morton wrote:

Why are you bothering with that?  It seems unnecessarily complex. Here's my 
amended rule:


/\bwww\s?\W?\s?\w{3,6}\d{2,6}\s?\W?\s?(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i


That would match wwwwhy11com, which may not be recognized by the mark as a 
URI they need to deobfuscate - do you really want it that loose?


That would match www.why11.com, which the regular URI processing would 
match - do you really want to match it twice?


That's why I posted a more-complex version.

Note that I'm not saying it's wrong, just that it's looser than I prefer.
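
The difference in looseness can be checked directly. The sketch below uses Python's `re` as a stand-in for Perl and makes two assumptions: the first example string is read as "wwwwhy11com" (a match needs the literal "www" prefix), and the stray `s` in each pattern is read as `\s`. The names `LOOSE` and `STRICT` are labels for this illustration only.

```python
import re

# The simpler rule: every separator is optional.
LOOSE = re.compile(
    r"\bwww\s?\W?\s?\w{3,6}\d{2,6}\s?\W?\s?"
    r"(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b",
    re.IGNORECASE,
)

# The more complex rule: at least one whitespace character is required
# between the parts.
STRICT = re.compile(
    r"\bwww(?:\s|\s\W|\W\s)\w{3,6}\d{2,6}(?:\s|\s\W|\W\s)"
    r"(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b",
    re.IGNORECASE,
)


def hits(rule, samples):
    """Return the samples in which the rule finds a match."""
    return [s for s in samples if rule.search(s)]


if __name__ == "__main__":
    samples = ["wwwwhy11com", "www.why11.com", "www. why11. com"]
    print("loose :", hits(LOOSE, samples))
    print("strict:", hits(STRICT, samples))
```

With these inputs the loose pattern matches all three strings, including the ordinary URL, while the strict one matches only the spaced-out form.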

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  False is the idea of utility that sacrifices a thousand real
  advantages for one imaginary or trifling inconvenience; that would
  take fire from men because it burns, and water because one may drown
  in it; that has no remedy for evils except destruction. The laws
  that forbid the carrying of arms are laws of such a nature. They
  disarm only those who are neither inclined nor determined to commit
  crime.   -- Cesare Beccaria, quoted by Thomas Jefferson
---
 7 days until the 233rd anniversary of the Declaration of Independence


Re: gpg signed spam email ???

2009-06-27 Thread Matt Kettler
RobertH wrote:
 i was reading at

 http://www.karan.org/blog/

 specifically

 http://www.karan.org/blog/index.php/2009/06/15/gpg-signed-spam

 that he recv'd a gpg signed spam email

 ive never heard of that before yet i havent thought much about it or studied
 it...

 Q: is this unheard of, or common?

 near as i can quickly investigate, it doesnt appear to be common as per
 papa google [sic].

 comments? feedback?

 just trying to get up on the curve now.

Well, let's put it this way:

A long, long time ago, SA had a rule in the default set, giving negative
score to PGP and GPG signed messages. Quickly, spammers started adding
enough fragments of a signature to match the rule. This was very
obvious, as the rule only matched the begin clause, and the spams had a
begin clause dropped at the bottom of the message, with no end clause.

The rule could have been modified to validate the signature, but of
course, anyone can GPG sign a message and have it be valid, and the
spammers probably would have done so if the rule changed. Therefore, the
rule was dropped from the set entirely.
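
The weakness described above can be sketched in a few lines of Python. The exact pattern of the old SA rule is not given in this thread, so `naive_looks_signed` is only a guess at its spirit; the armor lines are the standard OpenPGP ones. Neither function verifies anything cryptographically.

```python
BEGIN_SIG = "-----BEGIN PGP SIGNATURE-----"
END_SIG = "-----END PGP SIGNATURE-----"


def naive_looks_signed(body: str) -> bool:
    """Begin-clause-only check, in the spirit of the rule that was gamed."""
    return BEGIN_SIG in body


def has_complete_armor(body: str) -> bool:
    """Slightly stricter: an end clause must follow the begin clause."""
    begin = body.find(BEGIN_SIG)
    if begin == -1:
        return False
    return body.find(END_SIG, begin + len(BEGIN_SIG)) != -1


if __name__ == "__main__":
    # Hypothetical spam body with a begin clause dropped at the bottom.
    spam = "Cheap pills, click now!\n" + BEGIN_SIG + "\n"
    print(naive_looks_signed(spam), has_complete_armor(spam))
```

Even the stricter check is trivial to satisfy, and a fully valid signature still says nothing about whether the signer is trusted.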

GPG signatures only validate that the sender holds the private key that
matches the public key used to verify the email's signature. Like SPF and
many other authentication-only technologies, this doesn't tell you anything
about the sender. Even perfect authentication at best only provides
confirmation of who the sender is, and most of these technologies only
prove a sender is the proper owner or holder of some abstract identity like
a key or domain.

Authentication needs to be paired with recognition to be meaningful.  If
a sender proves who they are, will you immediately accept the email
without further question? What if they just proved they were Alan Ralsky?

http://www.spamhaus.org/rokso/listing.lasso?-op=cnspammer=Alan%20Ralsky


Moral of the story: don't assign negative scores to systems that only
provide authentication, unless you're somehow pairing it with proof the
sender is someone you actually trust (or at least is trusted by a
service you trust, etc).

Ever notice that the negative score of SPF_PASS is insignificantly
small? There's a reason for that. Spammers can pass SPF too, so by
itself it's meaningless. But paired with your explicit trust of a
domain or sender, it provides forgery-resistant whitelisting
(whitelist_from_spf).
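
The pairing of authentication with recognition can be put as a small decision function. This is a sketch of the reasoning only, not of SpamAssassin's internals: the function name, the trusted-domain set and the score values are all hypothetical.

```python
AUTH_ONLY_SCORE = -0.001   # illustrative: authentication alone earns almost nothing
TRUSTED_SCORE = -100.0     # illustrative: authenticated AND explicitly trusted


def score_adjustment(spf_pass: bool, sender_domain: str, trusted_domains: set) -> float:
    """Give a meaningful bonus only when a passing check meets explicit trust."""
    if not spf_pass:
        # Without authentication the claimed domain could be forged,
        # so trust in that domain must not be applied.
        return 0.0
    if sender_domain.lower() in trusted_domains:
        return TRUSTED_SCORE
    return AUTH_ONLY_SCORE


if __name__ == "__main__":
    trusted = {"example.org"}  # hypothetical local whitelist
    print(score_adjustment(True, "example.org", trusted))
    print(score_adjustment(True, "spammer.example", trusted))
    print(score_adjustment(False, "example.org", trusted))
```

The third call is the important one: a message merely claiming to be from a trusted domain gets no bonus unless the check passed.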








 


Re: [NEW SPAM FLOOD] www.shopXX.net

2009-06-27 Thread Jason Haar
All this talk about trying to catch urls that contain spaces/etc got me
thinking: why isn't this a standard SA feature? i.e if SA sees
www(whitespace|comma|period)-combo(therest), then rewrite it as the
url and process.

That way you get the whole force of SURBLs/etc onto it? I'm assuming all
these shop urls this thread has been agonizing about are already in
RBLs of course...
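
A rough Python sketch of that idea, assuming the obfuscation is limited to whitespace, commas and periods around a name of the shape seen in this thread (letters followed by digits). The function name and pattern are hypothetical, and this is not how SA handles URIs today.

```python
import re

# "www", up to three separator characters, a name such as "pill22",
# more separators, then one of three TLDs.
OBFUSCATED = re.compile(
    r"\bwww[\s.,]{1,3}(\w{3,6}\d{2,6})[\s.,]{1,3}(com|net|org)\b",
    re.IGNORECASE,
)


def collapse_obfuscated_urls(text: str) -> list:
    """Return canonical hostnames rebuilt from spaced-out www names."""
    return [
        "www.{}.{}".format(m.group(1), m.group(2)).lower()
        for m in OBFUSCATED.finditer(text)
    ]


if __name__ == "__main__":
    print(collapse_obfuscated_urls("Visit www. pill22. com today"))
```

The rebuilt hostnames could then be handed to whatever URI blocklist lookups are already in place. A pattern this permissive will also rewrite some innocent text, which is part of the trade-off being discussed.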

-- 
Cheers

Jason Haar
Information Security Manager, Trimble Navigation Ltd.
Phone: +64 3 9635 377 Fax: +64 3 9635 417
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1



Re: [NEW SPAM FLOOD] www.shopXX.net

2009-06-27 Thread Benny Pedersen

On Sun, June 28, 2009 01:57, Jason Haar wrote:
 All this talk about trying to catch urls that contain spaces/etc got me
 thinking: why isn't this a standard SA feature? i.e if SA sees
 www(whitespace|comma|period)-combo(therest), then rewrite it as the
 url and process.

Spammers would need to rewrite web browsers as well :=)

Would you click on a URL that is not clickable?

 That way you get the whole force of SURBLs/etc onto it? I'm assuming all
 these shop urls this thread has been agonizing about are already in
 RBLs of course...

One could extend the rule set to use ReplaceTags?

-- 
xpoint



RE: SA RegEx Rules

2009-06-27 Thread Cory Hawkless
Ahh, I have played with RegexBuddy, but when I copy and paste the SA rules into it, it 
does strange things that are inconsistent with the results I get from SA. These 
recent shopXX rules have been good examples, but I can't get RegexBuddy to 
reproduce the expected results.

Has anyone used RegexBuddy before?

-Original Message-
From: rich...@buzzhost.co.uk [mailto:rich...@buzzhost.co.uk] 
Sent: Saturday, 27 June 2009 5:12 PM
Cc: users@spamassassin.apache.org
Subject: Re: SA RegEx Rules

On Sat, 2009-06-27 at 16:56 +0930, Cory Hawkless wrote:
 Hi all,
 
  
 
 Been doing some reading on RegEx and even coming from a programming
 background it is a bit intimidating, my problem is I haven't been able
 to find a good source of information on exactly what\how SpamAssassin
 matches the RegEx rules when scanning and what variant of RegEx is
 being used?(I.E what syntax is and is not allowed?)
 
  
 
 I'd like to be able to make my own simple rules but it's proving quite
 difficult, Maybe a tool that I can use the build Regular Expressions
 would help?
 
  
 
 I'm sure there are PELNTY of other out ther that are rather bamboozled
 by this also and would benefit greatly from any assistance.
 
  
 
 Thanks in advance
 
 Cory
 
  
http://www.regexbuddy.com/




RE: [NEW SPAM FLOOD] www.shopXX.net

2009-06-27 Thread Cory Hawkless
I agree, wouldn't it be easier to uniformly feed all of these types of URLs
through the already existing SA filters? As Jason suggested, maybe by
collapsing whitespace?

Sounds like the obvious solution to me. Are there any problems with this? If
not, how can it be done?


-Original Message-
From: Jason Haar [mailto:jason.h...@trimble.co.nz] 
Sent: Sunday, 28 June 2009 9:28 AM
To: users@spamassassin.apache.org
Subject: Re: [NEW SPAM FLOOD] www.shopXX.net

All this talk about trying to catch urls that contain spaces/etc got me
thinking: why isn't this a standard SA feature? i.e if SA sees
www(whitespace|comma|period)-combo(therest), then rewrite it as the
url and process.

That way you get the whole force of SURBLs/etc onto it? I'm assuming all
these shop urls this thread has been agonizing about are already in
RBLs of course...

-- 
Cheers

Jason Haar
Information Security Manager, Trimble Navigation Ltd.
Phone: +64 3 9635 377 Fax: +64 3 9635 417
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1




RE: [NEW SPAM FLOOD] www.shopXX.net

2009-06-27 Thread Benny Pedersen

On Sun, June 28, 2009 05:38, Cory Hawkless wrote:
 I agree, wouldn't it be easier to uniformly feed all of these type of URL's
 though the already existing SA filters. As Jason suggested maybe by
 collapsing whitespaces?

Let's redefine what a URL is in the first place?

www localhost localdomain
www.localhost.localdomain

one of them does not work :)

Spammers more or less just use the first one, so what?

 Sounds like the obvious solution to me? Any problems with this? If not how
 can it be done?

Just show a working ReplaceTags definition for spaces, and then it can all be
solved by writing rules in which the spaces collapse into no spaces,
e.g. in my example above " " would become "." and then SA would see the first
URL as the last one.

IMHO this is what ReplaceTags does.

But as long as web browsers do not work with both forms, is it such a big problem?

-- 
xpoint



SA on Windows (XP) with Cygwin

2009-06-27 Thread Lee


Hello René and anyone else who has run SA on Windows under Cygwin,

I've been dabbling a little with this, having not used Cygwin 
beforehand, and I think I have grasped the basic operational principles 
of installing/building modules and SA, but it appears it may turn out to 
be a waste of time and fruitless venture; by that I mean it seems to be 
a pain or impossible to get various modules working including things 
like DCC and Razor. Is this indeed the case?

I think however René earlier said he had a 'full' install working?

Bearing in mind I am only a Windows XP person, whose grasp of 'command 
line' operations previously went no further than basic .bat files or 
using 'run' and typing 'cmd' 'ok', I need a relatively guaranteed and 
specific guide on how to get all the above working, should it in fact be 
possible.
As usual, I've found various bits and pieces on the net on SA under 
Cygwin, but none fully comprehensive or seemingly up to date.


My thinking is: I only want to pursue SA under Cygwin if I can at least 
achieve a better and more up-to-date equivalent of the Sourceforge 
September 2007 package of SAwin32/SAProxy. Otherwise, I may as well 
revert to using Popfile, which is an active project, although just a naive 
Bayesian training method (I think).
I appreciate I will also need to install some other bits to pipe the 
emails into SA and out to a desktop email client; presumably this will 
involve something like Procmail or  Fetchmail which were mentioned 
recently. Maybe even a Windows mail server. I do appreciate all this may 
be considered highly excessive for the sake of filtering personal email, 
but I like a project as long as I know how to do it specifically and can 
be assured it will work.


For info, here's what that SAwin32/SAProxy package claims/claimed to do:
---
SpamAssassin POP3 Proxy for Win32 (SAproxy) v3.2.3.3

Includes SpamAssassin v3.2.3, DCC v1.3.58 and Vipul's Razor v2.84.

This tool is a free and powerful spam filter for any Microsoft
Windows mail client (Outlook Express, Eudora, Microsoft Outlook, etc.).
It supports SSL, but it is for POP3 accounts only and will not work with
IMAP, Exchange, Lotus, web-based (such as Hotmail) and other non-POP3
accounts.

It includes SpamAssassin (http://spamassassin.apache.org/)
and fully supports online spam databases 
DCC(http://www.rhyolite.com/anti-spam/dcc/) and Razor 
(http://razor.sf.net/).

This build is based on free SAproxy v1.2 and is not associated with
Stata Labs (which no longer sells SAproxy Pro).
--

So, can an up to date and painless build of that be achieved under 
Cygwin? If so, which specific Cygwin and CPAN modules (and versions) 
will work with SA 3.2.5 or at least SA 3.2.4?


Thanks in advance;
if you feel it is most likely going to be a pain and an unknown 
quantity, just tell me so, and I won't frustrate myself with attempting 
this line of thought any further.

I appreciate one or two posters have already said / implied this.

Lee
UK