SA RegEx Rules
Hi all, I've been doing some reading on RegEx, and even coming from a programming background it is a bit intimidating. My problem is that I haven't been able to find a good source of information on exactly what/how SpamAssassin matches the RegEx rules when scanning, and what variant of RegEx is being used (i.e. what syntax is and is not allowed). I'd like to be able to make my own simple rules but it's proving quite difficult. Maybe a tool that I can use to build regular expressions would help? I'm sure there are PLENTY of others out there who are rather bamboozled by this too and would benefit greatly from any assistance. Thanks in advance, Cory
Re: SORBS bites the dust
On Fri, 2009-06-26 at 21:06 -0400, Charles Gregory wrote: On Fri, 26 Jun 2009, LuKreme wrote: See, it all comes down to what you think 'legitimate' is. The recipient wants the e-mail. DUH. That's not my definition at all. The very reason for my posting. You need not repeat yourself. It's not even the definition of any mailadmin I've ever met. We reject mail users *want* all the time. It's our job. There is some mileage in that. Inappropriate use by staff mailing massive, unnecessary attachments around is one such case. The recipients may well *want* these, but policies are often in place to limit them. That got a genuine laugh. Sounds like something out of the BOFH series. Nope, sometimes people WANT email that is laden down with malware, viruses, executable files, web bugs, or other things that compromise the security of not just themselves, but of others. Yep, I've had users call up asking why they have not had an email with a file attachment they are expecting. You tell them "It has a virus" or "It is not company policy to accept executable files by email", but do they stop there? Oh no. They get the sender to try and forward it via Hotmail or to a webmail account. When that blocks it too, you see the sender try again, this time zipping it up and crap. So yes, there are occasions when mailadmins block mail that recipients want and it is correct to do so. The thread has drifted and seems to be starting to take on the role of the Oxford English Dictionary of IT-related words. Legitimate mail? Just what is it? One man's legitimate is another man's illegitimate. One man's spam is another man's ham. I apply a simple formula. Legitimate mail comes from mail servers running on static IPs. These will not fall in a range assigned as Dynamic. They will not be listed in the PBL. The connecting IP will have, as a minimum, a PTR record. The contents of which I'm not fussed about; it just needs to exist.
That will have me at least happy to 'listen' to what that server has to say before making a decision on the mail it is sending. I've dealt with small African businesses out in the bush operating mail servers over miles of knackered telephone lines on modems, and even they can manage to satisfy such basic requirements. If any other mail admin is not capable of doing this then I don't want a connection from them (I probably would not want them working for my organisation either, not if I relied on email for my business). Email has some similarities to snail mail. The onus is on the sender to ship it correctly and NOT on the recipient. The sender must package and address it correctly, put the right postage on it, and send it from the correct place if they want delivery attempted on time or at all. You would not expect your snail mail to be collected from a trash can and delivered; you would use a defined mail box or post office. Legitimate mail to me comes from a legitimate server as above. Its content will then be:
1. A reply to a mail we have sent
2. An order, enquiry or quote
3. A staff message or memo
4. A request for help
There may be a few others, but legitimate mail will not generally be:
1. Someone trying to sell us something
2. Notifications of 'Special Offers'
3. Catch up mails from people we once bought a pencil from
4. From gmail, yahoo or hotmail. By far all I ever see from these providers is spam. If someone really does *not* have access to any other form of email they can pick up the phone and call us and we can exempt them. I've yet to find a legitimate business using any of them as its primary email provider.
Postini customers are also pushing their luck with the way the sending server never sends a 'QUIT' at the end of the session. This kind of sloppy crap is a different story, but is mentioned to show that even so-called professional email organisations can be sloppy and not do things as they should.
Finally - and this is the point where it is specifically relevant to Spamassassin - it won't trip a set score in SA. There is no need for legitimate mail to score high with SA. That's my take on it and it works for us. We get the odd gripe from managers called 'Steve' and 'Barry' that they have not had the 200 meg of pictures from the weekend party. You know the kind - the self important 'rules are not relevant to me' kind. It is usually sufficient to remind them of the acceptable usage policy and that we are overstaffed.
Re: SA RegEx Rules
On Sat, 2009-06-27 at 16:56 +0930, Cory Hawkless wrote: Hi all, I’ve been doing some reading on RegEx, and even coming from a programming background it is a bit intimidating. My problem is that I haven’t been able to find a good source of information on exactly what/how SpamAssassin matches the RegEx rules when scanning, and what variant of RegEx is being used (i.e. what syntax is and is not allowed). I’d like to be able to make my own simple rules but it’s proving quite difficult. Maybe a tool that I can use to build regular expressions would help? I’m sure there are PLENTY of others out there who are rather bamboozled by this too and would benefit greatly from any assistance. Thanks in advance, Cory
http://www.regexbuddy.com/
Re: SORBS bites the dust
Unless I've missed a message... this is the 100th reply to this thread. This has to be one of the longest threads I've seen on this list in years. I have to say I have issues with your definition of legit mail. Many people do send mail to other people out of the blue for legit reasons other than having some previous relation with that person. "4. From gmail, yahoo or hotmail." These sites do provide an important service for people. Not everyone is tech savvy enough to get their own domain name. If everyone had to use their ISP's domain name, think of the mess each time you change your ISP. But in general, there is definitely a grey area about what is and what isn't legit email, and I have to say that SpamAssassin does do a pretty decent job much of the time sorting it out.
Re: SORBS bites the dust
Michael Grant wrote: Unless I've missed a message... this is the 100th reply to this thread. This has to be one of the longest threads I've seen on this list in years. Shows there is much to discuss on this matter. Isn't there a generic spam related mailing list?
Re: SORBS bites the dust
On 6/27/2009 10:55 AM, Arvid Picciani wrote: Michael Grant wrote: Unless I've missed a message... this is the 100th reply to this thread. This has to be one of the longest threads I've seen on this list in years. Shows there is much to discuss on this matter. Isn't there a generic spam related mailing list? spam-l.com
Re: [NEW SPAM FLOOD] www.shopXX.net
Why are you bothering with that? It seems unnecessarily complex. Here's my amended rule:
/\bwww\s?\W?\s?\w{3,6}\d{2,6}s?\W?\s?(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i
Best regards, Jeremy Morton (Jez)
John Hardin wrote: On Fri, 26 Jun 2009, Paweł Tęcza wrote: On 2009-06-23 (Tue) at 09:39 +0200, Paweł Tęcza wrote:
body OBFU_URI_WWDD_2 /\bwww\s(?:\W\s)?\w{3,6}\d{2,6}\s(?:\W\s)?(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i
The spammers struck again at the weekend. Unfortunately the rule above doesn't work for the latest incarnation of that spam, i.e. www. pill22. com.
{sung to the tune of Peter Gabriel's Kiss That Frog} Whack that mole!
/\bwww(?:\s|\s\W|\W\s)\w{3,6}\d{2,6}(?:\s|s\W|\W\s)(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i
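To see why the earlier rule misses the new variant, the two patterns can be run side by side. This is only a rough sketch: it uses Python's re module as a stand-in for the Perl engine SpamAssassin actually uses (these particular patterns use only syntax the two share), the patterns are copied as they appear in the thread, and the names PAWEL, HARDIN and hits are made up for the illustration.

```python
import re

# OBFU_URI_WWDD_2 as posted by Paweł Tęcza (copied from the thread).
PAWEL = re.compile(
    r"\bwww\s(?:\W\s)?\w{3,6}\d{2,6}\s(?:\W\s)?(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b",
    re.IGNORECASE,
)

# John Hardin's revision, copied exactly as it appears in the thread.
HARDIN = re.compile(
    r"\bwww(?:\s|\s\W|\W\s)\w{3,6}\d{2,6}(?:\s|s\W|\W\s)(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b",
    re.IGNORECASE,
)


def hits(pattern, text):
    """Return True if the pattern matches anywhere in the text."""
    return pattern.search(text) is not None


# The first sample is the older space-only form, the second is the
# "www. pill22. com" form reported in the thread.
for sample in ("www pill22 com", "www. pill22. com"):
    print(f"{sample!r}: pawel={hits(PAWEL, sample)} hardin={hits(HARDIN, sample)}")
```

The earlier rule insists on whitespace directly after "www", so a dot placed before the space defeats it; the revision accepts either order of one punctuation character and one space.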
Re: SORBS bites the dust
On Sat, 2009-06-27 at 10:59 +0200, Yet Another Ninja wrote: On 6/27/2009 10:55 AM, Arvid Picciani wrote: Michael Grant wrote: Unless I've missed a message... this is the 100th reply to this thread. This has to be one of the longest threads I've seen on this list in years. Shows there is much to discuss on this matter. Isn't there a generic spam related mailing list? spam-l.com NANAE ?
Re: SA RegEx Rules
On Sat, 27 Jun 2009 16:56:33 +0930 Cory Hawkless c...@hawkless.id.au wrote: Hi all, I've been doing some reading on RegEx, and even coming from a programming background it is a bit intimidating. My problem is that I haven't been able to find a good source of information on exactly what/how SpamAssassin matches the RegEx rules when scanning, and what variant of RegEx is being used (i.e. what syntax is and is not allowed).
Perl
Re: SA RegEx Rules
On Sat, 2009-06-27 at 16:56 +0930, Cory Hawkless wrote: Been doing some reading on RegEx and even coming from a programming background it is a bit intimidating, my problem is I haven’t been able to find a good source of information on exactly what/how SpamAssassin matches the RegEx rules when scanning
Depends on the rule type you use. See the Rule Definitions section in the docs [1] and the Rule Writing intro guide [2] in the wiki. Between header, uri, body, and the other body rules, the scope of matching varies significantly. The type of rule may also have an impact on how the data is rendered (or not).
and what variant of RegEx is being used? (i.e. what syntax is and is not allowed?)
Perl Regular Expressions. The perlre [3] docs are quite heavy and best used as a comprehensive reference, though. For a gentle introduction and a longer tutorial, see the links in the Description section of [3].
I’d like to be able to make my own simple rules but it’s proving quite difficult. Maybe a tool that I can use to build regular expressions would help? I’m sure there are PLENTY of others out there who are rather bamboozled by this too and would benefit greatly from any assistance.
Besides writing your own custom rules, this list is a good source for general advice and ready-made rules targeting new spammer patterns. I suggest checking the (recent-ish) archives and lurking on the list. You can learn and pick up a great deal that way.
guenther
[1] http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Conf.html
[2] http://wiki.apache.org/spamassassin/WritingRules
[3] http://perldoc.perl.org/perlre.html
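For quick experiments with a simple rule outside SpamAssassin, something like the following sketch may help. It is not how SpamAssassin itself works: it assumes Python's re module is close enough to Perl regular expressions for the rule in question (the two differ in places), it only understands the simple "type NAME /regex/flags" shape, and it sees the raw sample text rather than whatever rendering SpamAssassin applies for a given rule type. The helper name compile_rule is made up for the illustration.

```python
import re

# Matches simple rule lines such as:  body RULE_NAME /pattern/i
# (header rules use a different shape and are not handled here).
RULE_LINE = re.compile(r"^(body|rawbody|uri)\s+(\w+)\s+/(.*)/([a-z]*)$")

# Perl-style trailing flags mapped to their Python counterparts.
FLAG_MAP = {"i": re.IGNORECASE, "s": re.DOTALL, "m": re.MULTILINE, "x": re.VERBOSE}


def compile_rule(line):
    """Split a simple rule line into (name, compiled pattern)."""
    match = RULE_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a simple regex rule: {line!r}")
    _kind, name, pattern, flags = match.groups()
    re_flags = 0
    for flag in flags:
        if flag not in FLAG_MAP:
            raise ValueError(f"unsupported flag {flag!r} in rule {name}")
        re_flags |= FLAG_MAP[flag]
    return name, re.compile(pattern, re_flags)


# A rule posted on this list, used here only as sample input.
RULE = (r"body OBFU_URI_WWDD_2 "
        r"/\bwww\s(?:\W\s)?\w{3,6}\d{2,6}\s(?:\W\s)?(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i")

name, regex = compile_rule(RULE)
for text in ("www shop22 com", "www.shop22.com"):
    print(name, repr(text), "hit" if regex.search(text) else "miss")
```

Note that the slashes and the trailing "i" are delimiters and flags, not part of the pattern, so they have to be stripped before the pattern is handed to any regex tool.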
gpg signed spam email ???
i was reading at http://www.karan.org/blog/ specifically http://www.karan.org/blog/index.php/2009/06/15/gpg-signed-spam that he recv'd a gpg signed spam email ive never heard of that before yet i havent thought much about it or studied it... Q: is this unheard of, or common? near as i can quickly investigate, it doesnt appear to be common as per papa google [sic]. comments? feedback? just trying to get up on the curve now. tia - rh
Re: gpg signed spam email ???
On Sat, June 27, 2009 16:02, RobertH wrote: just trying to get up on the curve now.
it all comes down to: do you trust the sender? whether you verify this with gpg or not is not the point. Mail::SpamAssassin::Plugin::Konfidi and Mail::SpamAssassin::Plugin::OpenPGP can both use gpg to verify trusted senders, but why not use dkim? -- xpoint
Re: [NEW SPAM FLOOD] www.shopXX.net
On Sat, 27 Jun 2009, Jeremy Morton wrote: Why are you bothering with that? It seems unnecessarily complex. Here's my amended rule:
/\bwww\s?\W?\s?\w{3,6}\d{2,6}s?\W?\s?(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b/i
That would match hy11com, which may not be recognized by the mark as a URI they need to deobfuscate - do you really want it that loose? That would match www.why11.com, which the regular URI processing would match - do you really want to match it twice? That's why I posted a more-complex version. Note that I'm not saying it's wrong, just that it's looser than I prefer.
-- John Hardin KA7OHZ http://www.impsec.org/~jhardin/ jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
--- False is the idea of utility that sacrifices a thousand real advantages for one imaginary or trifling inconvenience; that would take fire from men because it burns, and water because one may drown in it; that has no remedy for evils except destruction. The laws that forbid the carrying of arms are laws of such a nature. They disarm only those who are neither inclined nor determined to commit crime. -- Cesare Beccaria, quoted by Thomas Jefferson
--- 7 days until the 233rd anniversary of the Declaration of Independence
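The difference in looseness can be checked with a few test strings. Again a sketch only: Python's re module stands in for Perl's engine, both patterns are copied as they appear in the thread, and the run-together string "wwwwhy11com" is a test input chosen for this illustration, not something taken from the original mails.

```python
import re

# Jeremy Morton's simplified rule, as it appears in the thread.
JEREMY = re.compile(
    r"\bwww\s?\W?\s?\w{3,6}\d{2,6}s?\W?\s?(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b",
    re.IGNORECASE,
)

# John Hardin's stricter rule, as it appears in the thread.
HARDIN = re.compile(
    r"\bwww(?:\s|\s\W|\W\s)\w{3,6}\d{2,6}(?:\s|s\W|\W\s)(?:c\s?o\s?m|n\s?e\s?t|o\s?r\s?g)\b",
    re.IGNORECASE,
)

CASES = [
    "www. pill22. com",  # the obfuscated form the rules are meant to catch
    "www.why11.com",     # an ordinary URL that normal URI processing already sees
    "wwwwhy11com",       # run-together text (hypothetical test input)
]

for text in CASES:
    loose = JEREMY.search(text) is not None
    strict = HARDIN.search(text) is not None
    print(f"{text!r}: jeremy={loose} hardin={strict}")
```

Because every separator in the simplified rule is optional, it also fires when there is no separator at all or when the URL is already well formed; the stricter rule requires at least one whitespace character at each gap.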
Re: gpg signed spam email ???
RobertH wrote: i was reading at http://www.karan.org/blog/ specifically http://www.karan.org/blog/index.php/2009/06/15/gpg-signed-spam that he recv'd a gpg signed spam email ive never heard of that before yet i havent thought much about it or studied it... Q: is this unheard of, or common? near as i can quickly investigate, it doesnt appear to be common as per papa google [sic]. comments? feedback? just trying to get up on the curve now.
Well, let's put it this way: A long, long time ago, SA had a rule in the default set giving a negative score to PGP and GPG signed messages. Quickly, spammers started adding enough fragments of a signature to match the rule. This was very obvious, as the rule only matched the begin clause, and the spams had a begin clause dropped at the bottom of the message, with no end clause. The rule could have been modified to validate the signature, but of course, anyone can GPG sign a message and have it be valid, and the spammers probably would have done so if the rule changed. Therefore, the rule was dropped from the set entirely. GPG signatures only validate that the sender has the private key that matches the public one used to verify the email. Like SPF and many other authentication-only technologies, this doesn't tell you anything about the sender. Even perfect authentication at best only provides confirmation of who the sender is, and most of these technologies only prove a sender is the proper holder of some abstract identity like a key or domain. Authentication needs to be paired with recognition to be meaningful. If a sender proves who they are, will you immediately accept the email without further question? What if they just proved they were Alan Ralsky?
http://www.spamhaus.org/rokso/listing.lasso?-op=cnspammer=Alan%20Ralsky
Moral of the story: don't assign negative scores to systems that only provide authentication, unless you're somehow pairing it with proof the sender is someone you actually trust (or at least is trusted by a service you trust, etc). Ever notice that the negative score of SPF_PASS is insignificantly small? There's a reason for that. Spammers can pass SPF too, so by itself it's meaningless. But paired with your explicit trust of a domain or sender, it provides forgery-resistant whitelisting (whitelist_from_spf).
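The "authentication paired with recognition" point can be put as a tiny decision function. This is an illustration of the idea only, not SpamAssassin's implementation: the function name, the domain list and the score value are all assumptions made up for the sketch.

```python
# Illustrative score bonus for a sender that is both authenticated and trusted.
# The value is an assumption for this sketch, not a SpamAssassin default.
TRUSTED_BONUS = -100.0


def whitelist_adjustment(sender_domain, spf_pass, trusted_domains):
    """Return a score adjustment (negative means more ham-like).

    An SPF pass alone earns nothing, since spammers can publish SPF
    records too. Being on the trusted list alone earns nothing either,
    since the sender address could be forged. Only the combination counts.
    """
    if spf_pass and sender_domain.lower() in trusted_domains:
        return TRUSTED_BONUS
    return 0.0


# Hypothetical list of domains the recipient has chosen to trust.
trusted = {"example.org"}

print(whitelist_adjustment("example.org", True, trusted))    # authenticated and trusted
print(whitelist_adjustment("example.org", False, trusted))   # trusted name, possibly forged
print(whitelist_adjustment("spammer.test", True, trusted))   # authenticated, but unknown
```

Only the first call yields a bonus; the other two show why neither half is useful on its own.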
Re: [NEW SPAM FLOOD] www.shopXX.net
All this talk about trying to catch urls that contain spaces/etc got me thinking: why isn't this a standard SA feature? i.e. if SA sees www(whitespace|comma|period)-combo(therest), then rewrite it as the url and process. That way you get the whole force of SURBLs/etc onto it? I'm assuming all these shop urls this thread has been agonizing about are already in RBLs of course...
-- Cheers Jason Haar Information Security Manager, Trimble Navigation Ltd. Phone: +64 3 9635 377 Fax: +64 3 9635 417 PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1
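The rewrite step suggested above could look roughly like this. It is a standalone sketch of the idea, not an existing SpamAssassin feature: the function name, the limited TLD list and the separator set (whitespace, comma, period) are assumptions chosen for the illustration.

```python
import re

# "www", then one or more separators, a host label, separators again, a TLD.
# Only com/net/org are handled here, mirroring the rules posted in the thread.
OBFUSCATED_URL = re.compile(
    r"\bwww[\s.,]+([a-z0-9-]{2,63})[\s.,]+(com|net|org)\b",
    re.IGNORECASE,
)


def deobfuscate(text):
    """Rewrite spaced-out 'www. name. tld' sequences as normal hostnames."""
    return OBFUSCATED_URL.sub(
        lambda m: f"www.{m.group(1)}.{m.group(2)}".lower(),
        text,
    )


print(deobfuscate("Buy at www. pill22. com now"))
print(deobfuscate("www shop77 net"))
```

The rebuilt hostname could then be handed to whatever URI blocklist lookup is already in place. One caveat with this kind of rewrite is false positives: ordinary prose such as "www is com" would also be rewritten, so the result is probably better treated as an extra candidate than as a replacement for the original text.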
Re: [NEW SPAM FLOOD] www.shopXX.net
On Sun, June 28, 2009 01:57, Jason Haar wrote: All this talk about trying to catch urls that contain spaces/etc got me thinking: why isn't this a standard SA feature? i.e. if SA sees www(whitespace|comma|period)-combo(therest), then rewrite it as the url and process.
spammers would need to rewrite web browsers too :=) will you click on a url that is not clickable?
That way you get the whole force of SURBLs/etc onto it? I'm assuming all these shop urls this thread has been agonizing about are already in RBLs of course...
one could extend the rule set to use ReplaceTags? -- xpoint
RE: SA RegEx Rules
Ahh, I have played with regexbuddy, but when copying and pasting the SA rules in, it does strange things that are inconsistent with the results I get from SA. These recent shopxx rules have been good examples, but I can't get regexbuddy to reproduce the expected results. Has anyone used regexbuddy before?
-Original Message- From: rich...@buzzhost.co.uk [mailto:rich...@buzzhost.co.uk] Sent: Saturday, 27 June 2009 5:12 PM Cc: users@spamassassin.apache.org Subject: Re: SA RegEx Rules
On Sat, 2009-06-27 at 16:56 +0930, Cory Hawkless wrote: Hi all, I've been doing some reading on RegEx, and even coming from a programming background it is a bit intimidating. My problem is that I haven't been able to find a good source of information on exactly what/how SpamAssassin matches the RegEx rules when scanning, and what variant of RegEx is being used (i.e. what syntax is and is not allowed). I'd like to be able to make my own simple rules but it's proving quite difficult. Maybe a tool that I can use to build regular expressions would help? I'm sure there are PLENTY of others out there who are rather bamboozled by this too and would benefit greatly from any assistance. Thanks in advance, Cory
http://www.regexbuddy.com/
RE: [NEW SPAM FLOOD] www.shopXX.net
I agree, wouldn't it be easier to uniformly feed all of these types of URLs through the already existing SA filters? As Jason suggested, maybe by collapsing whitespace? Sounds like the obvious solution to me. Any problems with this? If not, how can it be done?
-Original Message- From: Jason Haar [mailto:jason.h...@trimble.co.nz] Sent: Sunday, 28 June 2009 9:28 AM To: users@spamassassin.apache.org Subject: Re: [NEW SPAM FLOOD] www.shopXX.net
All this talk about trying to catch urls that contain spaces/etc got me thinking: why isn't this a standard SA feature? i.e. if SA sees www(whitespace|comma|period)-combo(therest), then rewrite it as the url and process. That way you get the whole force of SURBLs/etc onto it? I'm assuming all these shop urls this thread has been agonizing about are already in RBLs of course...
-- Cheers Jason Haar Information Security Manager, Trimble Navigation Ltd. Phone: +64 3 9635 377 Fax: +64 3 9635 417 PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1
RE: [NEW SPAM FLOOD] www.shopXX.net
On Sun, June 28, 2009 05:38, Cory Hawkless wrote: I agree, wouldn't it be easier to uniformly feed all of these type of URL's though the already existing SA filters. As Jason suggested maybe by collapsing whitespaces? lets redefine how a url is in the first place ? www localhost localdomain www.localhost.localdomain one of them does not work :) spammers more or less just use the first one, so what ? Sounds like the obvious solution to me? Any problems with this? If not how can it be done? just show a working ReplaceTags for spaces, and then all can be solved to make rules with how spaces can rebuild into no spaces, eg in my above example will be . and then sa see the last url and first url imho this is what replacetags does but as long webbrowsers does not work on both, is it a big problem so ? -- xpoint
SA on Windows (XP) with Cygwin
Hello René and anyone else who has run SA on Windows under Cygwin, I've been dabbling a little with this, having not used Cygwin beforehand, and I think I have grasped the basic operational principles of installing/building modules and SA, but it appears it may turn out to be a waste of time and fruitless venture; by that I mean it seems to be a pain or impossible to get various modules working including things like DCC and Razor. Is this indeed the case? I think however René earlier said he had a 'full' install working? Bearing in mind I am only a Windows XP person, whose grasp of 'command line' operations previously went no further than basic .bat files or using 'run' and typing 'cmd' 'ok', I need a relatively guaranteed and specific guide on how to get all the above working, should it in fact be possible. As usual, I've found various bits and pieces on the net on SA under Cygwin, but none fully comprehensive or seemingly up to date. My thinking is; I only want to pursue SA under Cygwin if I can at least achieve a better and more up to date equivalent of the Sourceforge September 2007 package of SAwin32/SAProxy. Otherwise, I may as well revert to using Popfile which is an active project although just a naive bayesian training method. (I think.) I appreciate I will also need to install some other bits to pipe the emails into SA and out to a desktop email client; presumably this will involve something like Procmail or Fetchmail which were mentioned recently. Maybe even a Windows mail server. I do appreciate all this may be considered highly excessive for the sake of filtering personal email, but I like a project as long as I know how to do it specifically and can be assured it will work. For info, here's what that SAwin32/SAProxy package claims/claimed to do: --- SpamAssassin POP3 Proxy for Win32 (SAproxy) v3.2.3.3 Includes SpamAssassin v3.2.3, DCC v1.3.58 and Vipul's Razor v2.84. 
This tool is a free and powerful spam filter for any Microsoft Windows mail client (Outlook Express, Eudora, Microsoft Outlook, etc.). It supports SSL, but it is for POP3 accounts only and will not work with IMAP, Exchange, Lotus, web-based (such as Hotmail) and other non-POP3 accounts. It includes SpamAssassin (http://spamassassin.apache.org/) and fully supports online spam databases DCC(http://www.rhyolite.com/anti-spam/dcc/) and Razor (http://razor.sf.net/). This build is based on free SAproxy v1.2 and is not associated with Stata Labs (which no longer sells SAproxy Pro). -- So, can an up to date and painless build of that be achieved under Cygwin? If so, which specific Cygwin and CPAN modules (and versions) will work with SA 3.2.5 or at least SA 3.2.4? Thanks in advance; if you feel it is most likely going to be a pain and an unknown quantity, just tell me so, and I won't frustrate myself with attempting this line of thought any further. I appreciate one or two posters have already said / implied this. Lee UK