Re: Freemail problem

2011-02-17 Thread Jeremy Fairbrass


Noel Butler noel.but...@ausics.net wrote in message 
news:1297993593.5473.74.camel@tardis...

/Very Ancient/


On Thu, 2010-06-10 at 18:40 +0200, Jeremy Fairbrass wrote:


Hi, I've noticed what seems to be unexpected behaviour with the Freemail
plugin, which I'm hoping someone can shed some light on.

I'm using SpamAssassin 3.2.5, and the FreeMail.pm plugin v2.001 from
http://sa.hege.li, along with the rules from the 20_freemail.cf file at 
the

same location.



My second question is regarding the reference to
(financediamond[at]gmail.com) in the FREEMAIL_FROM results. That email
address does not appear *anywhere* in the entire message! Not in any of 
the
headers, nor in any part of the body. I've opened up the raw email file 
from

my mail server and searched the entire thing in a plain text editor, and
there is no reference anywhere to 'financediamond' at all. So why is the
FREEMAIL_FROM rule referring to that address? Is it a bug maybe? Could it
perhaps be crossing wires with another email which my SpamAssassin was
scanning at the same time, or something like that??




I am seeing this occasionally myself, including just now, except with 3.3.1
(hence my search of the mailbox, which found this post but only this post).
Somehow it's mixing in addresses from separate emails altogether. This is
Postfix, and SA is called from amavisd-new.

Were any suggestions given?

Cheers



I didn't receive any suggestions. I had hoped that when I eventually upgrade to 
3.3.x (I haven't done that yet), the problem would go away. So I'm sad to hear 
that it still exists.


- Jeremy 





Freemail problem

2010-06-10 Thread Jeremy Fairbrass
Hi, I've noticed what seems to be unexpected behaviour with the Freemail 
plugin, which I'm hoping someone can shed some light on.


I'm using SpamAssassin 3.2.5, and the FreeMail.pm plugin v2.001 from 
http://sa.hege.li, along with the rules from the 20_freemail.cf file at the 
same location.


Example #1:

Yesterday I spotted the following within the headers of a very spammy spam 
email that I received (total score 23.5 points):


-
Return-path: &lt;mr.anthonywalter2...@gmail.com&gt;
X-Spam-Report:
*  0.0 FREEMAIL_FROM Sender email is freemail (financediamond[at]gmail.com)
*   (mr.anthonywalter2010[at]gmail.com)
*  (mr.anthonywalter2010[at]gmail.com)
SNIP
From: MR. ANTHONY WALTER &lt;mr.anthonywalter2...@gmail.com&gt;
-

(I've removed the other headers which aren't relevant here)

As you can see, this spam used mr.anthonywalter2...@gmail.com as the 
envelope sender address (MAIL FROM during the SMTP transaction, which also 
appears in the Return-Path header). And it used the same address in the From 
header of the message too.


My first question is why does (mr.anthonywalter2010[at]gmail.com) appear 
twice within the FREEMAIL_FROM entry inside the X-Spam-Report header? Is it 
there twice because this address was used for both the Return-Path and the 
From headers? In other words, should I expect the FREEMAIL_FROM entry to 
list any freemail address which is used as the envelope sender, *as well as* 
any freemail address used in the From header of the message? I had assumed 
the FREEMAIL_FROM rule only looked at the From header but maybe that's 
incorrect.


My second question is regarding the reference to 
(financediamond[at]gmail.com) in the FREEMAIL_FROM results. That email 
address does not appear *anywhere* in the entire message! Not in any of the 
headers, nor in any part of the body. I've opened up the raw email file from 
my mail server and searched the entire thing in a plain text editor, and 
there is no reference anywhere to 'financediamond' at all. So why is the 
FREEMAIL_FROM rule referring to that address? Is it a bug maybe? Could it 
perhaps be crossing wires with another email which my SpamAssassin was 
scanning at the same time, or something like that??



Example #2:

Here is the FREEMAIL_FROM results from another email that was scanned by my 
SpamAssassin recently. This one was not spam - it was a legitimate email 
sent to a mailing list which is managed by my mail server:


-
X-Spam-Report:
*  0.0 FREEMAIL_FROM Sender email is freemail (munged[at]gmail.com)
*  (munged[at]gmail.com) (munged[at]gmail.com)
*  (munged[at]gmail.com) (munged[at]gmail.com)
*  (munged[at]gmail.com) (munged[at]gmail.com)
*  (munged[at]gmail.com) (munged[at]gmail.com)
*  (munged[at]gmail.com) (munged[at]gmail.com)
*  (munged[at]gmail.com) (munged[at]gmail.com)
*  (munged[at]gmail.com) (munged[at]gmail.com)
*  (munged[at]gmail.com) (munged[at]gmail.com)
*  (munged[at]gmail.com)
From: Joe Citizen &lt;mun...@gmail.com&gt;
-

I've munged the sender's name and email address, but as you can see, the 
sender's email address was listed multiple times within the FREEMAIL_FROM 
results there (that's the exact same address each time). But the sender's 
address definitely does not appear that many times within the headers and 
body of the email! So this looks very odd to me.


One possible explanation: the sender was sending an email to a mailing list 
on my server. My server then generates one copy of the email for each 
recipient on the mailing list, and sends all of those copies through 
SpamAssassin before sending them out to the recipients. So SpamAssassin is 
scanning multiple copies of the same message at the same time (only the To 
field is different in each one). Perhaps, as the FREEMAIL_FROM rule scans all 
these messages from the same sender at once, it reports its results back to 
the SpamAssassin engine in a way that makes SA think they all relate to a 
single message rather than to multiple messages, so SA puts all the results 
into the one FREEMAIL_FROM entry in the headers, as shown above. Even so, that 
still looks like a bug to me, because I've never had a similar problem with 
any other rule, even with emails sent through a mailing list like this. It's 
only the FREEMAIL_FROM rule that does this.


Any ideas?

Cheers,
Jeremy 





Re: Spamhaus DBL

2010-03-02 Thread Jeremy Fairbrass
ram r...@netcore.co.in wrote in message 
news:1267506187.16095.11.ca...@darkstar.netcore.co.in...

http://www.spamhaus.org/dbl/
I think sa-folks would have this already in some URIBL rule. What are
the scores you assign for a dbl positive hit ?

I assume my current datafeed would already extend to data access on the
dbl list. I will have to setup my rbldnsd before trying this out.




The new Spamhaus DBL should not be used with current versions of SpamAssassin. 
A new version of SpamAssassin, which adds support for the DBL, will be released 
soon. Check out the DBL FAQ at 
http://www.spamhaus.org/faq/answers.lasso?section=Spamhaus%20DBL which 
explains why, as quoted below:


**
SpamAssassin 3.3.1 Upgrade *Required*

SpamAssassin 3.3.1 is due to be released shortly and should coincide with 
the release of the DBL. SA 3.3.1 has new code for dealing specifically with 
domain-only URI BLs such as the DBL. To use the DBL you must upgrade to 
3.3.1. SpamAssassin versions prior to 3.3.1 query URI blocklists for URIs of 
both type domain ("http://abc.domain.tld") and IP ("http://1.1.1.1"), however 
DBL does not support IP queries; in fact it prohibits them.


DBL should not be used in versions of SpamAssassin prior to 3.3.1 because 
older versions of SpamAssassin make both IP and text queries to URI 
blocklists. The DBL supports only domain (text strings not dotted quads) 
queries. Do not 'hack' versions of SpamAssassin prior to 3.3.1 to add DBL 
support, for example by reusing a SURBL or URIBL configuration for DBL. You 
risk wrongly flagging legitimate email if you make IP queries to the DBL.

**
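For reference, the domain-only lookup that 3.3.x adds takes roughly this shape (a sketch from memory - the authoritative rules ship in the stock rules files of the 3.3.x release):

urirhssub  URIBL_DBL_SPAM  dbl.spamhaus.org.  A  127.0.1.2
body       URIBL_DBL_SPAM  eval:check_uridnsbl('URIBL_DBL_SPAM')
describe   URIBL_DBL_SPAM  Contains a URL listed in the Spamhaus DBL blocklist
tflags     URIBL_DBL_SPAM  net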

Also check out the announcement at 
http://www.spamhaus.org/news.lasso?article=655 which goes into further 
detail on this new list.


Cheers,
Jeremy 





Re: iXhash plugin and lists - feedback wanted

2008-08-05 Thread Jeremy Fairbrass

Dirk Bonengel [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]

Hi all,

I'm the author of the iXhash plugin, a piece of code that computes a variety of 'fuzzy checksums' along the lines of the NiXSpam 
project (run by the German IT magazine iX).
I also run two DNS zones (nospam.login-solutions.de, nospam.login-solutions.ag), containing fuzzy checksum data from various spam 
traps.


Now I'm leaving my current job, where I had the opportunity to run a dedicated 
server to maintain the lists.
I wonder if it is worth my while to actually migrate the whole setup (and expand it to contain data from other sources) or to 
just release a final version of the plugin and call it quits.


I guess this list is the best place to ask those of you who use the plugin for feedback. I'd appreciate any comments and 
information on hit rates, FPs and such.


Thanks in advance

Dirk



Hi Dirk,
I've been using iXhash for some time and find it to be accurate and invaluable. I've never had problems with FPs caused by it. So I 
for one really hope you'll be able to continue with it in one way or another, and adding extra data from other sources also sounds 
great to me!


MfG,
Jeremy 





Re: Regex help

2008-06-16 Thread Jeremy Fairbrass


mouss [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]

Mike Cisar wrote:

Hi All,

Have been trying to write a regex for a custom rule to catch a particular
spam that's been annoying the heck out of me.  


I've got about 6 body rules and have narrowed the problem down to the regex
that tries to catch this part (text appears in SPAM exactly as below,
including case, brackets and quotes).


change "." from "[POINT]" 
  


   /change \"\.\" from \"\[POINT\]\"/



Should be simple, but I'm sure it's a couple days of staring that's making
my brain miss the solution.  Can any of you regex geniuses give me a quick
regex to match that string.

Cheers,
  

Mike 



Is it really necessary to escape the quotation marks? I didn't think it was. 
I'd have thought something like this would work:

/change "\." from "\[POINT\]"/
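Wrapped up as a complete body rule it would be something like this (the rule name and score are just placeholders):

body      LOCAL_CHANGE_POINT   /change "\." from "\[POINT\]"/
describe  LOCAL_CHANGE_POINT   Obfuscated "change . from [POINT]" text in body
score     LOCAL_CHANGE_POINT   1.0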

Cheers,
Jeremy



Re: tflags multiple with mimeheader rules

2008-05-21 Thread Jeremy Fairbrass

Jeremy Fairbrass [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]

Hi all,
Can the tflags multiple setting be used with mimeheader rules? Or only with 
header, body, rawbody, uri, and full tests?

Also, where can I find some further info on how tflags multiple should be used - perhaps with an example or two? I can't find 
anything in the SpamAssassin wiki on this, and the brief description at 
http://spamassassin.apache.org/full/3.2.x/dist/doc/Mail_SpamAssassin_Conf.html isn't much help either.


Cheers,
Jeremy



Can anybody offer some help?! :)

- Jeremy



tflags multiple with mimeheader rules

2008-05-14 Thread Jeremy Fairbrass

Hi all,
Can the tflags multiple setting be used with mimeheader rules? Or only with 
header, body, rawbody, uri, and full tests?

Also, where can I find some further info on how tflags multiple should be used - perhaps with an example or two? I can't find 
anything in the SpamAssassin wiki on this, and the brief description at 
http://spamassassin.apache.org/full/3.2.x/dist/doc/Mail_SpamAssassin_Conf.html isn't much help either.
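For what it's worth, the usual pattern I've seen (a sketch, not something lifted from the docs) pairs a multiple-flagged subrule with a meta rule that tests the hit count - whether mimeheader rules honour it is exactly what I'm asking:

body      __LOCAL_REPEAT_HIT   /special offer/i
tflags    __LOCAL_REPEAT_HIT   multiple
meta      LOCAL_REPEAT_HIT     __LOCAL_REPEAT_HIT > 3
describe  LOCAL_REPEAT_HIT     Phrase appears more than three times in the body
score     LOCAL_REPEAT_HIT     1.0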


Cheers,
Jeremy 





triplets.txt

2008-05-08 Thread Jeremy Fairbrass

Hi, could someone kindly tell me what the file triplets.txt is used for, and 
if I need to have it in my rules directory or not?

Cheers,
Jeremy



Re: Starting a URIBL - Howto? [OT]

2008-04-28 Thread Jeremy Fairbrass

Rob McEwen [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]

Marc Perkel wrote:

I was just wondering from those of you who have done it - how to start a URIBL. 
I'm guessing the process (simplified) is:

1) Mine messages for links
2) Subtract out anything matching a fairly large white list

So my first question here is - what do most of you use to mine the links in a message with? Can someone point me in the right 
direction? Also - I'm willing to work with and share data with others who are already doing this.



Marc,

Just like a regular sender's IP dnsbl (aka RBL), the hardest part is not having FPs... in fact, this is probably *harder* for 
URIBLs compared to RBLs. The second hardest part is being able to list spammer's URIs *quickly* (particularly since trying to do 
so exacerbates the first problem.)


The process you described is the best way to start... it is where everyone starts. But many have started with amazing whitelists, 
done what you described, and have failed. It takes much more than a great whitelist to make a great blacklist.


In fact, I know someone who frequents these anti-spam lists ...who I consider smarter than either you or me... and I happen to 
consider him the world's foremost authority on how to create and maintain a *great* RBL. (I'm not allowed to mention who he is... 
in this context... but just about everyone reading this would recognize his name... NO, this is NOT Steve Linford... please, no 
questions or guesses about this!)  Anyway, over the past several months... he tried to create a great URIBL and, so far, his URIBL 
falls far short of SURBL and URIBL and ivmURI.


Marc, if I had to make a short list of those who I thought might be able to pull this off... you'd definitely be on the short 
list.


However, don't be discouraged if you come up short and/or if it takes many months... even years... to accomplish what you seek. If 
the guy I described can't do it (at least last I checked...), then believe me, this is NOT an easy task.


I know MUCH about this. I've been one of the admins for SURBL for the past 4+ years. Additionally, I created my own URIBL called 
ivmURI, which is now *easily* in the same league as SURBL and URIBL... In fact, ivmSIP is probably even better... at least, 
according to the hit stats and FP stats that some of my users have provided me where all three URI blacklists are compared to each 
other. (Of course, all three lists are indispensable... I use ALL of them in my spam filtering... and ALL 3 catch stuff the other 
2 miss... FOR EXAMPLE: http://invaluement.com/results.txt )


At this time, there is no other publicly available URI blacklist that comes close to SURBL and URIBL and ivmURI. No close 4th 
place. Again, *not* *even* *close*.


I hope this helps and doesn't discourage you. I had a wise college professor tell me "big problem, big solution... little problem, 
little solution". Spammers' URIs are a big problem that requires a big solution. Knowing what you're up against in creating a URI 
blacklist might seem discouraging in the short term, but might give you the proper long-term focus and patience you need to really 
pull this off.


Best wishes for your success in this endeavor!

Rob McEwen
(creator of the invaluement.com DNSBLs, ivmURI &amp; ivmSIP)




Hi Rob,
Are your invaluement.com DNSBLs available for us to use? Your http://invaluement.com/results.txt page tells me why I should be using 
it TODAY ;) but I can't find any info about how...!!


Cheers,
Jeremy 





Re: googlemail.com is this a free mail domain

2008-04-24 Thread Jeremy Fairbrass
I think it's also used in Germany. The two domain names function identically, and I even think if someone sends a message to either 
[EMAIL PROTECTED] or [EMAIL PROTECTED], both will reach you - ie. you can use them interchangeably. But whether you can 
officially register for one or the other probably depends on the country you're in.



Jari Fredriksson [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]

I received a spam from googlemail.com
I had assumed googlemail.com was not available for free
domains, Am I wrong. Can I register a googlemail.com id


If I remember it right, googlemail.com is a domain used instead of gmail.com in 
the UK.

There already was a gmail in the UK...




Re: Need help with bobax rules

2008-04-17 Thread Jeremy Fairbrass
Are Henry's versions of these rules different to what Jack posted below, and if so, where can I find them? I'm still running SA 
3.1.8 (unable to upgrade yet) so I wouldn't receive them if you've pushed them to the 3.2 sa-update.


Cheers,
Jeremy



Justin Mason [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]


for what it's worth, I just pushed Henry's version of Joe's rules into the
3.2.x sa-updates.

--j.

Jack Pepper writes:

Quoting Jeremy Fairbrass [EMAIL PROTECTED]:

 HI Jack,
 Any chance of sharing your rules for this?!

 Cheers,
 Jeremy

Sure:

score BOBAX_GEN_SPAM_2 1.800
header BOBAX_GEN_SPAM_2   ALL =~
/^Message-Id:[EMAIL PROTECTED]/m
describe BOBAX_GEN_SPAM_2   Has Bobax Generated Message-Id, type 2

score BOBAX_GEN_SPAM 1.800
header BOBAX_GEN_SPAM   ALL =~ /^Message-Id:.*EJXVWDA/m
describe BOBAX_GEN_SPAM   Has Bobax Generated Message-Id

One fellow suggested that it might be more efficient to do this:

score BOBAX_GEN_SPAM 1.800
header BOBAX_GEN_SPAM   Message-ID =~ /EJXVWDA/m
describe BOBAX_GEN_SPAM   Has Bobax Generated Message-Id

but I wasn't sure if SA would be tripped up by the incorrect case of the
word Message-Id and then not run the test, etc.  Any suggestions?

jp

--
Framework?  I don't need no steenking framework!


@fferent Security Labs:  Isolate/Insulate/Innovate
http://www.afferentsecurity.com







Re: Need help with bobax rules

2008-04-16 Thread Jeremy Fairbrass

HI Jack,
Any chance of sharing your rules for this?!

Cheers,
Jeremy



Jack Pepper [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
This info popped up on the emerging-Threats list.  I have watched our  
mail servers and have confirmed that it works.


The problem is that my attempts to create SpamAssassin rules for it never  
fire.  Can I get some tutelage from the list on creating rules for  
these unique conditions:


Message IDs randomized, but always the same length per field, and  
uses Message-Id instead of Message-ID:


Message-Id: [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]

Intel from Joe Stewart at  Secureworks.

Message-Id capitalized incorrectly, and EJXVWDA appears in the  
middle of the random prefix:


Message-Id: [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]

Intel from Joe Stewart at  Secureworks.

First group increments over time. Last group is the IP in hex backwards.
Like so:

Message-ID: [EMAIL PROTECTED]
Message-ID: [EMAIL PROTECTED]
Message-ID: [EMAIL PROTECTED]
Message-ID: [EMAIL PROTECTED]
Message-ID: [EMAIL PROTECTED]
Message-ID: [EMAIL PROTECTED]
Message-ID: [EMAIL PROTECTED]
Message-ID: [EMAIL PROTECTED]
Message-ID: [EMAIL PROTECTED]
Message-ID: [EMAIL PROTECTED]

Thanks again to Joe Stewart for the intel!




Anything that hits is generated by bobax/kraken/oderoor and can be dropped.

jp
--
Framework?  I don't need no steenking framework!


@fferent Security Labs:  Isolate/Insulate/Innovate  
http://www.afferentsecurity.com






Re: MP3 Spam

2007-10-19 Thread Jeremy Fairbrass
No, MIMEHeader works fine with 3.1.x

- Jeremy



Justin Mason [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]

 Martin.Hepworth writes:
 Hmm

 I'm still running 3.1.8..

 I think you need 3.2.x for the MIMEHeader plugin.

 --j.

 Content analysis details:   (7.4 points, 5.0 required)

  pts rule name  description
  -- 
 --
  1.5 HOST_EQ_NL HOST_EQ_NL
  3.0 BOTNET_IPINHOSTNAMEHostname contains its own IP address
   
 [botnet_ipinhosntame,ip=62.163.207.251,rdns=a207251.upc-a.chello.nl]
 -2.6 BAYES_00   BODY: Bayesian spam probability is 0 to 1%
 [score: 0.0064]
  1.6 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
   [Blocked - see 
 http://www.spamcop.net/bl.shtml?62.163.207.251]
  3.9 RCVD_IN_XBLRBL: Received via a relay in Spamhaus XBL
 [62.163.207.251 listed in zen.spamhaus.org]

 I just bumped the BOTNET_IPINHOSTNAME score so I score above my 5 limit now..

 Don't run RCVD_IN_SORBS_DUL as I found it FP heavy for my environment

 I expect to see mp3s in my environment, so that's maybe why bayes was at the 
 opposite end of the score spectrum to you.

 No JM_STORM_MP3 though... maybe a 3.1.8/3.2.3 thing, it lints clean.

 --
 Martin Hepworth
 Snr Systems Administrator
 Solid State Logic
 Tel: +44 (0)1865 842300

  -Original Message-
  From: UxBoD [mailto:[EMAIL PROTECTED]
  Sent: 19 October 2007 09:14
  To: Martin.Hepworth
  Cc: [EMAIL PROTECTED]
  Subject: Re: MP3 Spam
 
  Hmmm, hit okay here Martin :-
 
  X-Spam-Status: Yes, score=27.6 required=10.0
  tests=BAYES_99,BOTNET,CRM114_CHECK,
 
  HELO_DYNAMIC_CHELLO_NL,JM_STORM_MP3,RCVD_IN_BL_SPAMCOP_NET,RCVD_IN_SORBS_D
  UL,
  RCVD_IN_XBL,RDNS_DYNAMIC,TVD_SPACE_RATIO autolearn=unavailable
  version=3.2.3
 
  Regards,
 
  --[ UxBoD ]--
  // PGP Key: curl -s https://www.splatnix.net/uxbod.asc | gpg --import
  // Fingerprint: C759 8F52 1D17 B3C5 5854  36BD 1FB1 B02F 5DB5 687B
  // Keyserver: www.keyserver.net Key-ID: 0x5DB5687B
  // Phone: +44 845 869 2749 SIP Phone: [EMAIL PROTECTED]
 
  - Original Message -
  From: Martin.Hepworth [EMAIL PROTECTED]
  To: [EMAIL PROTECTED]
  Cc: [EMAIL PROTECTED]
  Sent: Friday, October 19, 2007 9:11:38 AM (GMT) Europe/London
  Subject: RE: MP3 Spam
 
 
 
  http://www.solidstatelogic.com/mp3-spam.txt
 
  --
  Martin Hepworth
  Snr Systems Administrator
  Solid State Logic
  Tel: +44 (0)1865 842300
 
   -Original Message-
   From: UxBoD [mailto:[EMAIL PROTECTED]
   Sent: 19 October 2007 09:01
   To: Martin.Hepworth
   Cc: [EMAIL PROTECTED]
   Subject: Re: MP3 Spam
  
   Can you post a copy online Martin ? need a few examples to find the
  common
   elements.
  
   Regards,
  
   --[ UxBoD ]--
   // PGP Key: curl -s https://www.splatnix.net/uxbod.asc | gpg --import
   // Fingerprint: C759 8F52 1D17 B3C5 5854  36BD 1FB1 B02F 5DB5 687B
   // Keyserver: www.keyserver.net Key-ID: 0x5DB5687B
   // Phone: +44 845 869 2749 SIP Phone: [EMAIL PROTECTED]
  
   - Original Message -
   From: Martin.Hepworth [EMAIL PROTECTED]
   To: [EMAIL PROTECTED]
   Sent: Friday, October 19, 2007 9:00:39 AM (GMT) Europe/London
   Subject: RE: MP3 Spam
  
  
   Just tried this on an example we had overnight and it didn't hit ;-(
  
   --
   Martin Hepworth
   Snr Systems Administrator
   Solid State Logic
   Tel: +44 (0)1865 842300
  
-Original Message-
From: UxBoD [mailto:[EMAIL PROTECTED]
Sent: 19 October 2007 08:45
To: Justin Mason
Cc: users@spamassassin.apache.org
Subject: Re: MP3 Spam
   
Thanks Justin.  Do they all follow the same patterns ?
   
Regards,
   
--[ UxBoD ]--
// PGP Key: curl -s https://www.splatnix.net/uxbod.asc | gpg --
  import
// Fingerprint: C759 8F52 1D17 B3C5 5854  36BD 1FB1 B02F 5DB5 687B
// Keyserver: www.keyserver.net Key-ID: 0x5DB5687B
// Phone: +44 845 869 2749 SIP Phone: [EMAIL PROTECTED]
   
- Original Message -
From: Justin Mason [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: users@spamassassin.apache.org
Sent: Thursday, October 18, 2007 8:24:35 PM (GMT) Europe/London
Subject: Re: MP3 Spam
   
   
UxBoD writes:
 Does anybody have one of these, or different one, that you could
   upload
somewhere so can do some analysis ?
   
sure: http://taint.org/x/2007/mp3spam.txt
anyway, these rules catch them as far as I can tell:
   
  ifplugin Mail::SpamAssassin::Plugin::MIMEHeader
  mimeheader __CTYPE_STORM_MP3_1 Content-Type:raw =~ /^audio\/mpeg;\n name=\"[a-z]+\.mp3\"$/s
  mimeheader __CDISP_STORM_MP3_1 Content-Disposition:raw =~ /^inline;\n filename=\"[a-z]+\.mp3\"$/s
  mimeheader __CTYPE_STORM_MP3_2 Content-Type:raw =~ /^audio\/mpeg;\n\tname=\"[a-z]+\.mp3\"$/s
  mimeheader __CDISP_STORM_MP3_2 Content-Disposition:raw 

Re: RBL Rules Question

2007-08-03 Thread Jeremy Fairbrass
Try this (replacing your three meta rules):

meta        RCVD_IN_LRBL_W  (__RCVD_IN_LRBL_W && !__RCVD_IN_LRBL_B)
describe    RCVD_IN_LRBL_W  Local RBL Whitelist
tflags      RCVD_IN_LRBL_W  net
score       RCVD_IN_LRBL_W  -7

meta        RCVD_IN_LRBL_B  (__RCVD_IN_LRBL_B && !__RCVD_IN_LRBL_W)
describe    RCVD_IN_LRBL_B  Local RBL Blacklist
tflags      RCVD_IN_LRBL_B  net
score       RCVD_IN_LRBL_B  7

meta        RCVD_IN_LRBL_Y  (__RCVD_IN_LRBL_W && __RCVD_IN_LRBL_B)
describe    RCVD_IN_LRBL_Y  Local RBL Yellowlist
tflags      RCVD_IN_LRBL_Y  net
score       RCVD_IN_LRBL_Y  -3

Note: if you put an exclamation mark directly in front of a rule name (eg. 
!__RCVD_IN_LRBL_B) it means "if this rule does NOT fire". 
Therefore, the meta rule RCVD_IN_LRBL_W above states "if __RCVD_IN_LRBL_W fires 
and __RCVD_IN_LRBL_B does not fire". And the meta 
for RCVD_IN_LRBL_Y obviously works when both __RCVD_IN_LRBL_W and 
__RCVD_IN_LRBL_B have fired. I think it's better to use && rather 
than + in this case.

Cheers,
Jeremy



UxBoD [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
 Hi,

 I have written the following ruleset for our local RBL server :-

 header      __RCVD_IN_LRBL    eval:check_rbl('LRBL', 'dnsrbl.local.com.')
 tflags      __RCVD_IN_LRBL    net

 header      __RCVD_IN_LRBL_B  eval:check_rbl_sub('LRBL', '127.0.0.2')
 tflags      __RCVD_IN_LRBL_B  net

 header      __RCVD_IN_LRBL_W  eval:check_rbl_sub('LRBL', '127.0.0.3')
 tflags      __RCVD_IN_LRBL_W  net

 meta        RCVD_IN_LRBL_W    (__RCVD_IN_LRBL_W + __RCVD_IN_LRBL_B >= 1)
 describe    RCVD_IN_LRBL_W    Local RBL Whitelist
 tflags      RCVD_IN_LRBL_W    net
 score       RCVD_IN_LRBL_W    -7

 meta        RCVD_IN_LRBL_B    (__RCVD_IN_LRBL_W + __RCVD_IN_LRBL_B >= 1)
 describe    RCVD_IN_LRBL_B    Local RBL Blacklist
 tflags      RCVD_IN_LRBL_B    net
 score       RCVD_IN_LRBL_B    7

 meta        RCVD_IN_LRBL_Y    (__RCVD_IN_LRBL_W + __RCVD_IN_LRBL_B >= 2)
 describe    RCVD_IN_LRBL_Y    Local RBL Yellowlist
 tflags      RCVD_IN_LRBL_Y    net
 score       RCVD_IN_LRBL_Y    -3

 But obviously it will score the whitelist and blacklist the same if the IP 
 address appears in both lists.  How can I say in the 
 meta rule that if it *only* appears in the blacklist it scores 7, and -7 if 
 in the whitelist, and if it is in both it uses the yellowlist?


 Regards,

 --[ UxBoD ]--
 // PGP Key: curl -s https://www.splatnix.net/uxbod.asc | gpg --import
 // Fingerprint: C759 8F52 1D17 B3C5 5854  36BD 1FB1 B02F 5DB5 687B
 // Keyserver: www.keyserver.net Key-ID: 0x5DB5687B
 // Phone: +44 845 869 2749 SIP Phone: [EMAIL PROTECTED]


 -- 
 This message has been scanned for viruses and
 dangerous content by MailScanner, and is
 believed to be clean.

 





Re: A rule for empty body and pdf attachment??

2007-08-02 Thread Jeremy Fairbrass
Michael W Cocke [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
 These blasted PDF spams are driving me mad!  Any ideas for a rule that
 would trip if there's no text in the body, just a PDF attachment ?

 (I'm using the PDFinfo plugin now, but I don't really understand it)

 Thanks!

 Mike-
 --
 If you're not confused, you're not trying hard enough.
 --
 Please note - Due to the intense volume of spam, we have installed
 site-wide spam filters at catherders.com.  If email from you bounces,
 try non-HTML, non-encoded, non-attachments,



If you're using the PDFinfo plugin, you should see a rule called 
GMD_PDF_EMPTY_BODY firing on those spams - it should fire on any message 
containing a PDF and a blank body. Obviously you can modify that rule's score 
if you want to make it higher, or meta it with other 
rules. Also make sure you're using the latest version of the plugin and the 
associated .cf file from 
www.rulesemporium.com/plugins.htm.
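For instance (a sketch only - the meta rule name is made up, and __SOME_OTHER_PDF_RULE is a placeholder for whichever other PDFinfo subrule you want to combine with):

# bump the stock rule's score locally
score     GMD_PDF_EMPTY_BODY      2.5

# or combine it with another indicator (replace the placeholder with a real rule)
meta      LOCAL_PDF_EMPTY_COMBO   (GMD_PDF_EMPTY_BODY && __SOME_OTHER_PDF_RULE)
describe  LOCAL_PDF_EMPTY_COMBO   Empty body plus another PDFinfo indicator
score     LOCAL_PDF_EMPTY_COMBO   1.5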

Cheers,
Jeremy 





Re: Help with a multi-line mode rule

2007-07-16 Thread Jeremy Fairbrass
Hi Per,
Actually \n matches a newline. $ matches before a newline, ie. the end of a 
line before the invisible newline itself. Therefore, ^ 
and $ match after and before a newline (\n), respectively.

At least that's my understanding. And this isn't the issue for me. It's 
figuring out how to get multiline mode to work in a rule.

- Jeremy


Per Jessen [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
Jeremy Fairbrass wrote:

 Hi all,
 I hope someone can help me with a rule I'm trying to write. My
 understanding of the multi-line mode, with the /m switch at the end,
 is this: in this mode, the caret (^) and dollar ($) match before and
 after newlines in the string. Is that correct?

Hi Jeremy,

a $ will match a newline, whether before or after something.


/Per Jessen, Zürich






Re: Help with a multi-line mode rule

2007-07-16 Thread Jeremy Fairbrass

Loren Wilton [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
 Right? If I test this rule using the Regex Coach tool at 
 http://weitz.de/regex-coach/ (I'm on Windows), with the 'm' switch 
 enabled, the rule works fine. But when I test it with SpamAssassin, it 
 doesn't work and I believe it's due to the caret and 
 dollar.

 However I want to specifically specify that the word test must be at the 
 very end of the Subject line - hence, I want to have 
 the $ after it. I also want to specify that the X-Return-Path must be there, 
 which is why I have the rest of the rule the way it 
 is, but that's not the issue.

 What am I doing wrong?

 (Of course in reality I'm not searching for the above strings, I'm trying to 
 catch a particular spam sign, but this is a simple 
 example of the method I'm using)


 Simpler and more efficient technique, and it will handle the headers in any 
 order, not just specifically one after the other:

 header __TEST_SUBJECT   Subject =~ /test\n?$/
 header __X_RETURN       exists:X-Return-Path
 meta MY_RULE (__TEST_SUBJECT && __X_RETURN)


 If you only want to search headers, you can use the pseudo header ALL and 
 only search the headers, and not the entire body using a 
 full rule.

 If you really want to match multiple lines in a specific order, you can use 
 /s and then use \n to match the newlines explicitly. 
 (There is a way to get /m to work, but I always have to spend half an hour 
 testing the rule before I figure out the magic trick 
 again.)

Loren

Thanks Loren, I'll give /s a whirl instead, maybe I'll have more luck with that.

I did specifically need to search for multiple lines in that way, rather than 
doing a meta rule, because I actually want to use a 
tagged expression. Sorry I didn't explain that in my first post - but that's 
why I need it all in a single rule, which spans 
multiple lines within the message - ie. I want to search for a certain string 
which appears in more than one place in the message 
(and I'm stuck with 3.1.8 at the moment so I can't take advantage of the new 
features of 3.2 yet...)

I've spent a lot more than half an hour trying to get /m to work, with no joy! 
:)

Cheers,
Jeremy 





Help with a multi-line mode rule

2007-07-14 Thread Jeremy Fairbrass
Hi all,
I hope someone can help me with a rule I'm trying to write. My understanding of 
the multi-line mode, with the /m switch at the end, 
is this: in this mode, the caret (^) and dollar ($) match before and after 
newlines in the string. Is that correct?

I believe this is the correct method for allowing me to use a full rule (ie. 
searching the entire undecoded message) but also 
specifying carets and dollars within the regex, right?

So I think this should mean that I can have some text like this, for example:

Subject: this is a test
From: [EMAIL PROTECTED]
X-Return-Path: [EMAIL PROTECTED]

...and create a rule like the following which should hit on it:

full    MYRULE    /^Subject:.* test$(?:\s(?!X-Return-Path).*)+\sX-Return-Path: [EMAIL PROTECTED]/m

Right? If I test this rule using the Regex Coach tool at 
http://weitz.de/regex-coach/ (I'm on Windows), with the 'm' switch enabled, 
the rule works fine. But when I test it with SpamAssassin, it doesn't work and 
I believe it's due to the caret and dollar.

However I specifically want the word test to be at the very 
end of the Subject line - hence, I want to have the 
$ after it. I also want to specify that the X-Return-Path must be there, which 
is why I have the rest of the rule the way it is, but 
that's not the issue.

What am I doing wrong?

(Of course in reality I'm not searching for the above strings, I'm trying to 
catch a particular spam sign, but this is a simple 
example of the method I'm using)

Cheers,
Jeremy 





Re: PDFInfo plugin with SA 3.1.7

2007-07-12 Thread Jeremy Fairbrass
I'm running PDFInfo 0.3 with SA 3.1.8 and it works fine for me - and I'm 
even running it on Windows! :)

Cheers,
Jeremy


  Suhas Ingale [EMAIL PROTECTED] wrote in message 
news:![EMAIL PROTECTED]
  Hello,



  I am trying to run PDFInfo plugin with SA 3.1.7. SA registers the plugin 
successfully but does not scan the PDFs in the emails. According to Dallas 
Engelken (creator of PDFInfo), "The MIME parser in SA is not seeing a PDF 
attachment on this message."



  Has anyone tried running PDFInfo plugin with 3.1.7 version?








Re: New version of iXhash plugin available

2007-07-05 Thread Jeremy Fairbrass
Thanks Dirk!
I have a question: two of the RBL zones have very similar names - 
nospam.login-solutions.de and nospam.login-solutions.ag. Do they 
belong to the same company, and what are the differences between them? Eg. do 
they both contain exactly the same data (hashes) as 
each other, or are there some differences between them, such that it's 
advisable to use them both? (I also noted that in your 
latest .cf file, you score each one differently - 4.5 vs. 2.5).

Cheers,
Jeremy



Dirk Bonengel [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
 Folks,

 I've finally come around to releasing a new version of the iXhash plugin. If 
 you happen to use that plugin, just get the code (now 
 located at http://ixhash.sf.net) and upgrade.
 Normally simply replacing the iXhash.pm file should do. Just make sure you 
 have the version corresponding to your  SA version.
 The new version now uses Net::DNS::Resolver's query method (as opposed to 
 search in the earlier code) and stops computing hashes 
 once it's got a hit.

 For those that don't know what this plugin does: it uses an algorithm 
 developed by Bert Ungerer of the German IT magazine iX (Heise 
 Verlag) to compute fuzzy checksums from (spam) emails and checks them against 
 those hashes I and Heise computed from our spam 
 (and serve via DNS). In short, this puts it in the league of Pyzor, Razor and 
 DCC. It's certainly no 'German Wunderwaffe' against 
 spam but I think it has its merits.

 If you happen to have some significant spamtrap feed you might also be 
 interested to set up your own hash database to check your 
 production mails against. I added a server program that computes the 
 necessary hashes and stores them in a MySQL table as well as 
 another plugin that sources that table. If you do let me know - I'd be 
 interested in any results.

 Dirk
 





bayes_ignore_header for X-Spam values

2007-07-03 Thread Jeremy Fairbrass
Hi all,
Can someone please advise me: is it good or bad to add bayes_ignore_header 
values in my local.cf file for the X-Spam headers that 
are added by SA? For example:

bayes_ignore_header X-Spam-Status
bayes_ignore_header X-Spam-Level
bayes_ignore_header X-Spam-Checker-Version
bayes_ignore_header X-Spam-Report
bayes_ignore_header X-Spam-Processed

I've seen some installations that do have these values, but I'm not sure why - 
I'd have thought it was good for Bayes to be able to 
learn from those headers. What would happen if I did *not* ignore those 
headers and let Bayes learn from them?

Thanks,
Jeremy 





Re: Custom Rule to catch this

2007-03-08 Thread Jeremy Fairbrass
Strange indeed - not for me - I'm using The Regex Coach from 
http://weitz.de/regex-coach/ which so far has always done a perfect job of 
testing regexes. Maybe it's wrong in this case - who knows! :)

BTW I'm not sure it's necessary to escape the space character within the 
[square brackets] - I think it's acceptable to just have 
[ ] without the \ inside. Although it doesn't do any harm having it in there 
either...

Cheers,
Jeremy



[EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]
 On Thu, 8 Mar 2007, Jeremy Fairbrass wrote:

 I just tested those three rules below, and none of them work with 
 www.superveils . com (ie. having a space both before and 
 after that dot).

 Strange, it matches rule 3 with egrep:

 echo 'www.superveils . com' | egrep 'www[\ ]+?\.([a-z0-9\-\ ]?)+\.[\ ]+(com|net|org)'
 www.superveils . com

 Of course you can add other strange characters which obfuscate the URL like 
 Nigel suggested (like , !, ...)

 K.

 





Re: Vbounce ruleset whitelist_bounce_relays

2007-02-19 Thread Jeremy Fairbrass
Hi Justin,
What exactly is the fix, and where do I find it?

I just installed the VBounce plugin on my server this weekend (for the first 
time), and have the same probs described here - ie. 
although I've added my server to whitelist_bounce_relays in local.cf, I'm not 
getting the MY_SERVERS_FOUND rule firing when I 
deliberately cause my server to generate a bounce message back to me.
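(For reference, the relevant bits of my config look roughly like this - the hostname is just a placeholder:)

loadplugin Mail::SpamAssassin::Plugin::VBounce
whitelist_bounce_relays mail.mydomain.com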

One question I do have: does MY_SERVERS_FOUND look for the existence of my 
server's name within the headers of the bounce message 
itself, or only within the body of the bounce message (ie. where the original 
message might be)? In my case, the bounce message body 
does NOT contain the original message nor any reference to my server name, 
however my server name is obviously mentioned in the 
headers of the bounce message itself, and I'd have hoped that this would cause 
MY_SERVERS_FOUND to fire.

Anything I can do to correct this?

Cheers,
Jeremy



Justin Mason [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED]

 Matt Kettler writes:
 Steve [Spamassassin] wrote:
  Justin Mason wrote:
  could you post an example of your config and the message you're testing
  with, in full?
  OK in /etc/mail/spamassassin/local.cf

  Received: by mail.mydomain.com (Postfix) id EFBE62E48F; Wed,  7 Feb
  2007 12:57:43 + (GMT)

 Nice.. A Received: header with no from clause.

 My guess is that the whitelist isn't working because it thinks this
 message came from nowhere at all. In an environment where your outbound
 SMTP server is also your MX, all bounce messages you get will be
 received by mail.mydomain.com, but only locally generated bounces will
 come from it.

 Actually, this is definitely a bug :(
 http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5331
 I've added a fix to work around it in SVN.

 --j.
 





Bug with FAKE_HELO_MSN

2007-01-09 Thread Jeremy Fairbrass
Hi all,
I'm not sure if this is a bug with the FAKE_HELO_MSN rule, or if I'm just 
overlooking something...

I just received a legitimate email from MSN.com (to verify an email address 
for MSN Messenger). The email triggered the FAKE_HELO_MSN rule, but I can't 
see why. Here are the 3 Received headers that appeared in the email:


Received: from servera02.blusmtp4.msn.com (servera02.blusmtp.msn.com 
[65.55.238.141])
 by myserver.com (myserver.com [123.123.123.123])
 with ESMTP id md5080742.msg
 for [EMAIL PROTECTED]; Tue, 09 Jan 2007 10:12:33 +0100
Received: from servera03.tk2smtp4.msn.com ([10.20.194.192]) by 
servera02.blusmtp4.msn.com with Microsoft SMTPSVC(6.0.3790.1830);
  Tue, 9 Jan 2007 04:12:07 -0500
Received: from TK2PPBAT3A01 ([65.54.136.164]) by servera03.tk2smtp4.msn.com 
with Microsoft SMTPSVC(6.0.3790.1830);
  Tue, 9 Jan 2007 01:12:06 -0800


As you can see, the host and rDNS both end with msn.com - why did the 
rule trigger?

I assume this rule only checks against the most recent Received header, 
right? Or does it check against all Received headers? Regardless, it should 
not have fired even against any of the older Received headers, as far as I 
can tell.

Any comments?

Cheers,
Jeremy 





Re: check_illegal_chars

2006-12-05 Thread Jeremy Fairbrass
Thanks - however I don't know anything about Perl scripts, so unfortunately 
it doesn't help me! :) For example, within EvalTests.pm I can see what 
appear to be four variables:
($self, $header, $ratio, $count)

The $header variable is pretty straightforward, but what's with $self, 
$ratio and $count? What do these mean, and what values could I put in an SA 
rule for them?

I guess I was also hoping to find a list of the actual characters that are 
considered illegal.

Cheers,
Jeremy


Theo Van Dinter [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 On Thu, Nov 30, 2006 at 06:22:46PM +0100, Jeremy Fairbrass wrote:
 Can someone please let me know exactly what illegal characters are being
 checked for with the eval:check_illegal_chars rules? Can I find a list of
 those characters somewhere?
 Also, what are the meanings of the variables that this rule takes? For
 example:

 You'll want to take a look at EvalTests.pm.  It should answer all your
 questions.

 -- 
 Randomly Selected Tagline:
 Stewie: Ah!  Damn it!  I want pancakes.  God!  You people understand
 every language except English.  Yo quiero pancakes.  Dali mua pancakes.
 Clik clik bloody clik pancakes!
 - Family Guy, Love Thy Trophy
 





check_illegal_chars

2006-11-30 Thread Jeremy Fairbrass
Hi all,
Can someone please let me know exactly what illegal characters are being 
checked for with the eval:check_illegal_chars rules? Can I find a list of 
those characters somewhere?

Also, what are the meanings of the variables that this rule takes? For 
example:

eval:check_illegal_chars('Subject','0.00','2')

...I get the 'Subject' bit, it clearly means that the rule is only gonna 
check the Subject field. Can I put in the name of *any* header I want in 
that part, eg. Received, To, etc etc?

And what do the '0.00' and the '2' variables mean?

Cheers,
Jeremy 





Re: Updated to SA 3.1.3 to get sa-update... But:

2006-11-29 Thread Jeremy Fairbrass
Why does your rule not work? It looks good to me, if you're trying to detect 
a subject consisting of (for example) "hi it's John" or something. Can you 
give some exact samples of subject lines you're trying to flag?

If this string ("hi it's ") is the only thing in those subject fields - 
nothing else at all - then it might be wise to anchor your regex to the 
start and end of the field using ^ and $ as follows:

headerHI_ITS_NAME   Subject =~ /^hi it's +[a-z]+$/i

...That way, you avoid potential false positives.

Cheers,
Jeremy




Simon [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
I was getting these spam emails with the subject "Name wrote:", so
 someone suggested I update SA and run sa-update. Which I have, and it's
 now solved that issue - nice.

 But now I'm getting the subject "hi it's Name" - does someone have a custom
 ruleset for this spam please? I'm trying to write one myself with no
 luck:

 headerHI_ITS_NAME   Subject =~ /\bhi\sit's\s+[a-z]/i
 describe  HI_ITS_NAME   Hi It's Name in Subject
 score HI_ITS_NAME   6.5
 





Re: RBL checks and -lastexternal

2006-11-24 Thread Jeremy Fairbrass
Okay, thanks for the explanation. I was hoping to have a way of whitelisting 
certain servers from all DNSBL tests - but they are servers that are not 
within my control, not my own local server, and thus inappropriate to add 
them to internal_networks. And I don't want to remove my own server from 
trusted_networks as that would have other negative consequences.

Basically, I have some custom rulesets that I want to use to check the 
connecting IP against the zz.countries.nerd.dk countries list. But I don't 
want to check all of the IPs that the emails have passed through - I only 
want to check the IP that connected to my own server - hence, I should 
use -lastexternal, right??!

But at the same time, I'd like to have the ability to whitelist certain 
other servers so that they are not included in this country check. Eg. maybe 
I want to block all emails that come from an IP in China (where the IP is 
the one connecting to me), *BUT* I want to exclude a particular server in 
China that is used by a friend who I trust, for example. How could I do 
that? Well, I guess I could make a meta rule that combines my 
zz.countries.nerd.dk rules with something else that prevents those rules 
from working if the trusted IP is found within the Received header or 
something - but that would be fiddly, and would be a nuisance if I had a 
whole bunch of IPs that I wanted to whitelist. It would obviously be much 
easier if I could simply trust/exclude from testing all the IPs listed in 
trusted_networks.
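To illustrate the fiddly version (the IP and rule names here are made up - RCVD_IN_CN stands in for whatever the local nerd.dk country rule is actually called):

header  __FRIEND_RELAY           Received =~ /\[192\.0\.2\.45\]/
meta    RCVD_IN_CN_NOT_FRIEND    (RCVD_IN_CN && !__FRIEND_RELAY)
score   RCVD_IN_CN_NOT_FRIEND    3.0
# and drop the original country rule's score so only the meta adds points
score   RCVD_IN_CN               0.01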

Any ideas?

Cheers,
Jeremy



Matt Kettler [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Jeremy Fairbrass wrote:
 Hi all,
 It says at
 http://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Conf.html#network_test_options
 that when an IP address is added to a 'trusted_networks' entry (eg. in
 local.cf), DNS blacklist checks will never query for hosts on these
 networks.

 However, from what I can see (using SA 3.1.5), if I have a check_rbl rule
 and the set name ends with -lastexternal, then SA will still do a DNSBL
 lookup on the lastexternal IP address even though that IP address is 
 added
 to my trusted_networks. Surely it should not do this?

 Is this correct, and is there any way around it, such that any IP address
 added to trusted_networks is NEVER checked by a check_rbl rule, 
 regardless
 of whether -lastexternal is used or not?


 Technically, that documentation is mistaken, slightly.

 Trusted hosts are immune to MOST DNSBL tests. However in -notfirsthop
 and -lastexternal only members of internal_networks are immune.

 If you really need a host to be immune to ALL dnsbl checks, it needs to
 be in both.

 If you have a server that you operate and want it to be able to receive
 mail from dynamic IPed hosts, make it a member of trusted_networks, but
 not a member of internal_networks. This will cause the lastexternal
 test to apply to the server, not the dynamic hosts, and the server
 itself will not be checked against other RBLs.



 Cheers,
 Jeremy 





Re: RBL checks and -lastexternal

2006-11-24 Thread Jeremy Fairbrass
Yeah I do manage the MTA, but I do still want to pass those emails to 
SpamAssassin for checking - I just don't want SA to run the DNSBL tests 
against those whitelisted IPs, but I do still want SA to run all its other 
tests against the email, as it might still be spam anyway. All I could do at 
the MTA level is tell the MTA not to pass the email over to SA at all, 
which is not what I want.

Another example might be: say I wanted to add the server of my ISP to 
trusted_networks. The server doesn't generate spam itself, but it could 
possibly still have spam passing through it to me from elsewhere. This fits 
within the description of the correct usage of trusted_networks at 
http://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Conf.html:

"A trusted host could conceivably relay spam, but will not originate it, and 
will not forge header data."

And at the same time, I want to exclude that server from the DNSBL tests, 
including the nerd.dk ones I run. Having the server added to 
trusted_networks means that most of the DNSBL tests won't be run against the 
server's IP, but the -lastexternal tests still will be. I wish there were 
some way of completely whitelisting an IP (at SA level) from all DNSBL 
tests, regardless of -lastexternal etc. I wonder if such functionality will 
be possible with SA 3.2.0?

As a side note, I think a number of other SA rules could also fire on ham in 
the above scenario - eg. there are some rules in SA that look for a HELO 
name with no dots in it within X-Spam-Relays-Untrusted, such as the 
__HELO_NO_DOMAIN rule. If I were to (for example) add my ISP's server to 
trusted_networks, and another customer of that ISP sent an email to me 
through the ISP's server, most likely this rule (__HELO_NO_DOMAIN) would 
fire if that other user's computer used a single-word machine name with no 
dots in it - know what I mean? And that would cause an FP. Likewise with 
many of the rules in 20_fake_helo_tests.cf which also search for certain 
strings within X-Spam-Relays-Untrusted, and could conceivably hit on ham 
emails passed from an end-user to his own server which I might have added to 
trusted_networks. Right? Wouldn't it be better, therefore, to have those 
rules in 20_fake_helo_tests.cf (and also the __HELO_NO_DOMAIN rule) use 
X-Spam-Relays-External instead of X-Spam-Relays-Untrusted??

- Jeremy



Matt Hampton [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Jeremy Fairbrass wrote:
 I want to block all emails that come from an IP in China (where the IP is
 the one connecting to me), *BUT* I want to exclude a particular server in
 China that is used by a friend who I trust, for example. How could I do
 that?

 Do you manage the MTA?  If you do, this would be an ideal case for using
 the zz.countries.nerd.dk zone as an RBL and then whitelisting the server at MTA
 level.

 Well, I guess I could make a meta rule that combines my
 zz.countries.nerd.dk rules with something else that prevents those rules
 from working if the trusted IP is found within the Received header or
 something - but that would be fiddly, and would be a nuisance if I had a
 whole bunch of IPs that I wanted to whitelist. It would obviously be much
 easier if I could simply trust/exclude from testing all the IPs listed in
 trusted_networks.


 matt
 





RBL checks and -lastexternal

2006-11-23 Thread Jeremy Fairbrass
Hi all,
It says at 
http://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Conf.html#network_test_options
 
that when an IP address is added to a 'trusted_networks' entry (eg. in 
local.cf), DNS blacklist checks will never query for hosts on these 
networks.

However, from what I can see (using SA 3.1.5), if I have a check_rbl rule 
and the set name ends with -lastexternal, then SA will still do a DNSBL 
lookup on the lastexternal IP address even though that IP address is added 
to my trusted_networks. Surely it should not do this?

Is this correct, and is there any way around it, such that any IP address 
added to trusted_networks is NEVER checked by a check_rbl rule, regardless 
of whether -lastexternal is used or not?

Cheers,
Jeremy 





Re: name-in-subject spammers switch to images

2006-11-21 Thread Jeremy Fairbrass
Where exactly can I find the new RCVD_FORGED_WROTE2 rule you refer to? I 
have RCVD_FORGED_WROTE in my 80_additional.cf file, but I don't have any 
RCVD_FORGED_WROTE2 rule. And yes, I have run sa-update to get the latest 
updates available :)

Cheers,
Jeremy




Tony Finch [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 SA should do better with these since the new RCVD_WROTE2 rule should
 trigger. They also follow the same forgery pattern with the
 From/To/Message-ID lines.

 Tony.
 -- 
 f.a.n.finch  [EMAIL PROTECTED]  http://dotat.at/
 BAILEY: CYCLONIC BECOMING NORTHWESTERLY SEVERE GALE 9 TO VIOLENT STORM 11,
 OCCASIONALLY HURRICANE FORCE 12 IN SOUTH, DECREASING 7 TO SEVERE GALE 9 
 LATER.
 HIGH OR VERY HIGH. RAIN OR SQUALLY SHOWERS. MODERATE OR POOR.


 From [EMAIL PROTECTED] Mon Nov 20 23:25:17 2006
 Return-Path: [EMAIL PROTECTED]
 Received: from ppsw-7-intramail.csi.cam.ac.uk ([192.168.128.137])
 by cyrus-24.csi.private.cam.ac.uk (Cyrus v2.1.16-HERMES)
 with LMTP; Mon, 20 Nov 2006 23:25:17 +
 X-Sieve: CMU Sieve 2.2
 X-Cam-SpamScore: sss
 X-Cam-SpamDetails: scanned, SpamAssassin-3.1.7 (score=7.654,
 DATE_IN_FUTURE_03_06 2.01, HTML_MESSAGE 0.00,
 RCVD_IN_BL_SPAMCOP_NET 1.33, RCVD_IN_XBL 3.11,
 TVD_FW_GRAPHIC_NAME_MID 1.20)
 X-Cam-AntiVirus: Not scanned
 X-Cam-ScannerInfo: http://www.cam.ac.uk/cs/email/scanner/
 Received: from [61.241.116.236] (port=18036 helo=boomerbible.com)
 by ppsw-7.csi.cam.ac.uk (mx.cam.ac.uk [131.111.8.147]:25)
 with esmtp id 1GmIVc-0005FI-Mw (Exim 4.63) for [EMAIL PROTECTED]
 (return-path [EMAIL PROTECTED]); Mon, 20 Nov 2006 23:25:10 
 +
 Received: from 142.179.156.89 (HELO MAIL.bondars.com)
 by ucs.cam.ac.uk with esmtp (-PC8KV+I09@ R.LW7)
 id [EMAIL PROTECTED]@:[EMAIL PROTECTED]
 for [EMAIL PROTECTED]; Mon, 20 Nov 2006 23:25:05 -0480
 From: May [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Subject: hi abuse
 Date: Mon, 20 Nov 2006 23:25:05 -0480
 Message-ID: [EMAIL PROTECTED]
 MIME-Version: 1.0
 Content-Type: multipart/related;
 boundary==_NextPart_000_000A_01C70D3E.2A9CC8C0
 X-Mailer: Microsoft Office Outlook, Build 11.0.6353
 X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2741.2600
 Thread-Index: Aca6Q*J2A0+681CO=:KTW6=6-1/0)2==

 This is a multi-part message in MIME format.

 --=_NextPart_000_000A_01C70D3E.2A9CC8C0
 Content-Type: multipart/alternative;
 boundary==_NextPart_001_000B_01C70D3E.2A9CC8C0


 --=_NextPart_001_000B_01C70D3E.2A9CC8C0
 Content-Type: text/plain;
 charset=iso-8859-2
 Content-Transfer-Encoding: 7bit

 three mornings  release Monday  stressed-out  activities they  one day a
 week.
 three mornings  A lack of spontaneous would worry if  Gervasio said. when
 they can  in the shuffle,  and ballet for each  three mornings  plenty of
 time
 It says enrichment tools  joy that is a cherished   the report says. and
 ballet for each  joy that is a cherished
 children are plopped in  of Wilmette, Ill.  places to play are scarce, the
 report says. Academy  of Philadelphia, Pennsylvania. discover
 Many parents
 I truly believe
 at the beach  I hope it will have some effect,  with get-smart  prepared
 by two
 children's schedules  neighborhoods  at the group's  I truly believe 
 it's
 chasing butterflies, playing with relate to others and  It says enrichment
 tools  Many parents for looking for  trouble finding buddies
 in the shuffle,  That's a light schedule
 play is a simple
 Ginsburg, the report's lead author and
 successful children. Above all,  they must be  time, it can increase risks
 for  as a requirement  on the floor with  mom and dad --

 with get-smart  part of childhood,  what children  time, it can increase
 risks for  have the resources,   contribute to depression
 weekly, plus T-ball  plenty of time  skills,  front of get-smart
 Here's some soothing   true toys
 For now,
 huge variety of  huge variety of  compared with  plenty of time  super
 parents, I believe this message
 stressed-out  who are free to come
 kids: The American
 and ballet for each
 joy that is a cherished  Gervasio said her
 academy committees for  Here's some soothing  annual meeting in  the
 pressure,  For now,
 activities  the report says.  classes in a
 Noted pediatrician and author  and organized  her kids  activities they
 feel pressure to be  children's schedules develop problem-solving
 they must be  that they're  is more good,
 would worry if
 Social pressures  what children
 over and just play.
 of Wilmette, Ill.  said Gervasio,  obesity. It may even Noted pediatrician
 and author  time, it can increase risks for
 and lots of own thing,  things you can do  Jennifer Gervasio
 Ginsburg, the report's lead author and  obesity. It may even relate to
 others and  her kids' friends, and  Spontaneous,
 the pressure,  old-fashioned playtime.
 activities  Dr. T. Berry Brazelton praised  of Wilmette, Ill.  There is a
 part  overscheduled
 healthy, development  she says, she  videos, enrichment
 of me that  activities can be  develop 

MIMEHeader question

2006-11-17 Thread Jeremy Fairbrass
Hi all,
I have a question about the MIMEHeader plugin: if I have multiple mimeheader 
rules, are they all checked against the same part in a multipart message?

So let me give an example:

Let's say an email has 2 separate mime header sections (perhaps one is TXT 
and the other is HTML, or perhaps there are 2 file attachments, or whatever). 
They might look like this:

--=_NextPart_000_0062_01C7099B.069AFD30
Content-Type: image/gif;
 name="Blank Bkgrd.gif"
Content-Transfer-Encoding: base64

--=_NextPart_001_0063_01C7099B.069AFD30
Content-Type: text/html;
 charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable


Then let's say I have a couple of mimeheader rules as follows:

mimeheader  __RULE1  Content-Type =~ /image\/gif/
mimeheader  __RULE2  Content-Transfer-Encoding =~ /quoted-printable/
meta  MY_META_RULE  (__RULE1 && __RULE2)

My question is, will the meta rule trigger, or not? Because as you can see, 
only the first mime header section contains Content-Type: image/gif, and 
only the second mime header section contains Content-Transfer-Encoding: 
quoted-printable. So are my two mimeheader rules being run against each 
header section separately from each other, or are they only run against the 
header sections together, and thus BOTH must fire on the SAME header section 
in order for the meta rule to work??

Cheers,
Jeremy 





Re: MIMEHeader question

2006-11-17 Thread Jeremy Fairbrass

Justin Mason [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]

 Jeremy Fairbrass writes:
 Hi all,
 I have a question about the MIMEHeader plugin: if I have multiple 
 mimeheader
 rules, are they all checked against the same part in a multipart message?

 So let me give an example:

 Let's say an email has 2 separate mime header sections (perhaps one is 
 TXT
 and the other is HTML, or perhaps there are 2 file attachments, or 
 whatever).
 They might look like this:

 --=_NextPart_000_0062_01C7099B.069AFD30
 Content-Type: image/gif;
  name="Blank Bkgrd.gif"
 Content-Transfer-Encoding: base64

 --=_NextPart_001_0063_01C7099B.069AFD30
 Content-Type: text/html;
  charset=iso-8859-1
 Content-Transfer-Encoding: quoted-printable


 Then let's say I have a couple of mimeheader rules as follows:

 mimeheader  __RULE1  Content-Type =~ /image\/gif/
 mimeheader  __RULE2  Content-Transfer-Encoding =~ /quoted-printable/
 meta  MY_META_RULE  (__RULE1 && __RULE2)

 My question is, will the meta rule trigger, or not? Because as you can 
 see,
 only the first mime header section contains Content-Type: image/gif, and
 only the second mime header section contains Content-Transfer-Encoding:
 quoted-printable. So are my two mimeheader rules being run against each
 header section separately from each other, or are they only run against 
 the
 header sections together, and thus BOTH must fire on the SAME header 
 section
 in order for the meta rule to work??

 the former.


Okay - so you're saying that the two mimeheader rules will actually run 
separately from each other, on each header section, and thus the meta rule 
WILL trigger? That's actually not how I'd want it to work. Is it possible, 
then, to have a meta rule (or some other method) using the mimeheader rules, 
that will ONLY trigger if both mimeheader rules trigger against the SAME 
header section? ie. all elements searched for by all mimeheader rules, must 
exist within the same header section - is this possible? Or do I have to 
resort to a 'full' rule or something?
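
(One way that kind of same-part check can be approximated - untested, rule name invented here, and it assumes Content-Type appears before Content-Transfer-Encoding within the part's header block - is a single 'full' rule over the pristine message:

# rough sketch: only fires if both headers sit inside the same MIME header block
full      JF_GIF_QP_SAME_PART   /^Content-Type:\s*image\/gif\b[^\n]*\n(?:[^\n]+\n)*?Content-Transfer-Encoding:\s*quoted-printable/im
describe  JF_GIF_QP_SAME_PART   image/gif and quoted-printable declared in the same MIME header block
score     JF_GIF_QP_SAME_PART   0.1

The (?:[^\n]+\n)*? piece only crosses non-blank lines, so the match cannot run past the blank line that ends that part's headers.)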





Re: Add rbl list to spamassassin 3.0.4 ?

2006-11-16 Thread Jeremy Fairbrass
You can change the score line to this, if you simply want the score to be 
3:

score  PRIVATE_RBL   3.0

Also, make sure that the file you create in your spamassassin directory, has 
the .cf file extension - ie. it should be: 99_Private_Rbl.cf rather than 
simply 99_Private_Rbl
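
(For reference, the complete 99_Private_Rbl.cf would then look something like this, assuming the zone really is rbl.mydomain.com as mentioned in the follow-up quoted below:

# assumes the private list is served from rbl.mydomain.com (see follow-up below)
uridnsbl  PRIVATE_RBL   rbl.mydomain.com.   TXT
body      PRIVATE_RBL   eval:check_uridnsbl('PRIVATE_RBL')
describe  PRIVATE_RBL   Contains an URL listed in the Private Rbl blocklist
tflags    PRIVATE_RBL   net
score     PRIVATE_RBL   3.0
)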

Cheers,
Jeremy




Noc Phibee [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Noc Phibee wrote:

 Hi

 For add a new personnal rbl list, i can use:

 into /etc/mail/spamassassin/
 i add a 99_Private_Rbl
 and into i put:

 uridnsbl  PRIVATE_RBL   sbl.spamhaus.org.   TXT
 body  PRIVATE_RBL   eval:check_uridnsbl('PRIVATE_RBL')
 describe PRIVATE_RBL   Contains an URL listed in the Private Rbl 
 blocklist
 tflags  PRIVATE_RBL   net
 score  PRIVATE_RBL   0 1.094 0 1.639


 it's correct ?
 Why in score, i have 0 1.094 0 1.639 ?
 if i want a score to 3 when the ip are listed, what is the correct score 
 line ?

 Thanks for your help



 a small error, i change sbl.spamhaus.org by rbl.mydomain.com
 





Re: check_rbl and DNSBL lookups

2006-11-16 Thread Jeremy Fairbrass
A further question to this: if I want to disable one of those rules in 
20_dnsbl_tests.cf, do I only need to give a score of 0 (in local.cf) to the 
rule with the check_rbl part, or do I need to give a score of 0 to each of 
the 'sub' rules?

For example, there are three sections to the Spamhaus lookups, as follows:

header __RCVD_IN_SBL_XBL eval:check_rbl('sblxbl', 'sbl-xbl.spamhaus.org.')
describe __RCVD_IN_SBL_XBL Received via a relay in Spamhaus SBL+XBL
tflags __RCVD_IN_SBL_XBL net

header RCVD_IN_SBL  eval:check_rbl_sub('sblxbl', '127.0.0.2')
describe RCVD_IN_SBL  Received via a relay in Spamhaus SBL
tflags RCVD_IN_SBL  net

header RCVD_IN_XBL  eval:check_rbl('sblxbl-lastexternal', 
'sbl-xbl.spamhaus.org.', '127.0.0.[456]')
describe RCVD_IN_XBL  Received via a relay in Spamhaus XBL
tflags RCVD_IN_XBL  net


So if I wanted to disable them all, would I only need to give a 0 score to 
__RCVD_IN_SBL_XBL (ie. the first one), or would I need to give a 0 score to 
both RCVD_IN_SBL and RCVD_IN_XBL?

I guess what I'm really wanting to know is, is it possible to give a 0 score 
to any rule starting with a double-underscore (__SOMETHING) in order to 
disable it? As I know that double-underscore rules are kinda special and 
don't normally count for any score in the first place.
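
(If it turns out the double-underscore rule can't simply be zeroed, the fallback that definitely works is zeroing the scored rules themselves in local.cf:

score RCVD_IN_SBL 0
score RCVD_IN_XBL 0
)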

Thanks!



Richard Frovarp [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
I am trying to go through and remove some of the DNSBL lookups that are 
being performed. I have found previous posts that state just set the meta 
rule to a score of 0 to disable. I have also found previous posts that 
state only these evals are performing lookups: check_rbl, check_rbl_txt and 
check_rbl_envfrom. And that check_rbl_sub do not perform a lookup, but use 
previous rules. What about check_rbl_accreditor?

 Furthermore, looking in 20_dnsbl_tests.cf I see this:

 header __RCVD_IN_NJABLeval:check_rbl('njabl', 
 'combined.njabl.org.')
 header RCVD_IN_NJABL_DUL  eval:check_rbl('njabl-lastexternal', 
 'combined.njabl.org.', '127.0.0.3')
 header __RCVD_IN_SORBSeval:check_rbl('sorbs', 
 'dnsbl.sorbs.net.')
 header RCVD_IN_SORBS_DUL  eval:check_rbl('sorbs-lastexternal', 
 'dnsbl.sorbs.net.', '127.0.0.10')
 header __RCVD_IN_SBL_XBL  eval:check_rbl('sblxbl', 
 'sbl-xbl.spamhaus.org.')
 header RCVD_IN_XBLeval:check_rbl('sblxbl-lastexternal', 
 'sbl-xbl.spamhaus.org.', '127.0.0.[456]')

 Am I missing something? It seems to me that all of the -lastexternal lines 
 will perform duplicate DNS lookups from the previous line, perhaps just a 
 little bit later. I of course run a caching name server, but it does seem 
 to be an extra query and those lines could be changed into check_rbl_sub.

 Thanks,

 Richard

 





Line wrapping

2006-10-30 Thread Jeremy Fairbrass
Hi all,
I've noticed with SA 3.1.5 that the length of the lines in the X-Spam-Report 
header seems to have reduced, ie. the line length for each rule mentioned 
there is not as long as it used to be, and thus the lines are wrapping more 
often than before. Just in the X-Spam-Report only, the other headers seem 
fine.

Is there any particular setting in SA that controls the length of the lines 
before they wrap? Eg. perhaps something in local.cf that I can use to 
control this? Or is it hard coded into SA, or perhaps it's actually a 
function of my mail server software, and nothing to do with SA at all???

Thanks,
Jeremy 





FORGED_HOTMAIL_RCVD bug??

2006-10-17 Thread Jeremy Fairbrass
G'day everyone,
I received a legitimate email from Hotmail today, which (I believe) 
inappropriately triggered the FORGED_HOTMAIL_RCVD rule in my SpamAssassin 
(version 3.1.5). The email from Hotmail was actually a bounce-back to an 
email sent by one of my users to a Hotmail address - it was bouncing back as 
a "no such user" error from Hotmail, but I think that's not relevant.

There were only two Received headers in the email from Hotmail, and they are 
as follows (unchanged except for the munging of mydomain.com). The top-most 
Received header was added by my server, and is therefore reliable, as is the 
Hotmail IP stated there - 65.54.246.140. Can anyone tell me why the 
FORGED_HOTMAIL_RCVD rule misfired, and what I might be able to do about 
it?

--
Received: from bay0-omc2-s4.bay0.hotmail.com (bay0-omc2-s4.bay0.hotmail.com 
[65.54.246.140])
 by mail.mydomain.com (mail.mydomain.com [87.230.126.33])
 (MDaemon PRO v9.5.0gm1)
 with ESMTP id md5068214.msg
 for [EMAIL PROTECTED]; Mon, 16 Oct 2006 10:25:51 +0200

Received: from bay0-mc2-f7.bay0.hotmail.com ([65.54.244.47]) by 
bay0-omc2-s4.bay0.hotmail.com with Microsoft SMTPSVC(6.0.3790.1830);
  Mon, 16 Oct 2006 00:52:09 -0700
--

Cheers,
Jeremy 





Re: ZMI

2006-09-13 Thread Jeremy Fairbrass
AFAIK it's currently residing at http://zmi.at/x/70_zmi_german.cf

- Jeremy



[EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 what is the current home of the ZMI (german) ruleset?

 Wolfgang Hamann 





Re: Why are most of my messages EMPTY_MESSAGE

2006-09-06 Thread Jeremy Fairbrass
I've had that problem in the past, and found that it was caused by an error 
with some other rule elsewhere (usually a custom rule I'd written myself 
which had a syntax error in it that I'd overlooked). I'd suggest doing 
a --lint check of your rules, see what it turns up.
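
For example, from the command line (the -D variant adds debug output, which shows which file the offending rule lives in):

spamassassin --lint
spamassassin -D --lint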

- Jeremy



scottjf8 [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]

 Using the newest SA with Amavisd

 Most of my messages keep getting hit by EMPTY_MESSSAGE and
 MISSING_SUBJECT when these are plaintext emails with subjects and
 messages

 Where should I start troubleshooting?
 -- 
 View this message in context: 
 http://www.nabble.com/Why-are-most-of-my-messages-EMPTY_MESSAGE-tf2224818.html#a6164890
 Sent from the SpamAssassin - Users forum at Nabble.com.

 





Re: Train from Outlook?

2006-08-24 Thread Jeremy Fairbrass
I use a nifty tool called OLSpamCop to achieve this functionality with my
Outlook. OLSpamCop is an Outlook plugin, it adds a new toolbar to Outlook
and basically allows you to select an email, hit either a spam or ham
button on the toolbar, and OLSpamCop will forward the email to an address
you've specified in the options - a different address depending on which
button you hit. It was designed for sending spam to SpamCop, but can be used
to forward the spam (or ham) to any address you specify, eg. to your mail
server and then to SpamAssassin for learning, eg. if you have set up "this 
is spam" and "this is ham" receiving email addresses on your server, as my 
server (MDaemon) does. When authenticated emails are forwarded to either of
these addresses on my server, it automatically runs the Bayes learning on
them accordingly. Thus this Outlook plugin works perfectly for me.

You can find it at http://www.olspamcop.org/. Oh yeah, and it's freeware...!

Cheers,
Jeremy

---
Christopher Mills [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
Tell me something, is there a pluggin for outlook that would allow me to
train spamassassin on the web server?
Eg, messages come in, end up in my Junk Mail folder, can i somehow select
them, and click a button with this 'addin' and have it find our web server
and train spam assassin with the data in my local inbox?  That would be a
very cool addon if someone could develop it.






Re: Any rules for URLs like this?

2006-08-18 Thread Jeremy Fairbrass
I'm not sure it's actually obfuscated though?? It seems to be a valid URL, I
mean in terms of it existing in DNS as-is, and in terms of it working (click
on it and it takes you to the spammer's site). I actually didn't know you
could use [] characters in a domain name, but I guess you can - this one
works anyway. In any case, the question would be whether or not there are,
or should be, any rules to detect a URL with [] characters in it - I think 
it would be pretty easy to write such a rule if necessary, but would there 
be any chance of FPs as a result? I dunno what the RFCs say about the usage 
of such characters in a sub-domain...
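
(If anyone did want to experiment with such a rule, a rough, untested sketch - rule name and score invented here, and note it would also hit IPv6 literal URLs, which legitimately use square brackets - might be:

# example only; check mass-check / FP behaviour before giving it a real score
uri       JF_BRACKET_HOST   /^https?:\/\/[^\/]{0,60}\[/i
describe  JF_BRACKET_HOST   Square brackets in the host part of a URL
score     JF_BRACKET_HOST   0.5
)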

Cheers,
Jeremy



Loren Wilton [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Do you need rules for them?  It looks like URIBL was able to pick
 it up fine.

 Yes, but I want enough points to push it over the automatic-discard
 threshhold. An extra point or two for that form of obfuscation would
 be welcome (to me, at least).

 I wrote a rule against those sort of things about a month back.  I don't 
 recall just off the top of my head what the masscheck results were, but I 
 don't recall them being real impressive at the time.  Possibly I concluded 
 that the rule wasn't quite correct and was having problems hitting nearby 
 html entities.

Loren 





Re: Advanced regex question - backtracking vs. negative lookaheads

2006-04-26 Thread Jeremy Fairbrass
Good point, you're completely right! Thanks for pointing that out... :)

Cheers,
Jeremy


John Rudd [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]

 On Apr 25, 2006, at 6:33 AM, Jeremy Fairbrass wrote:



 /style="[^"]+color:blue/



 <span style="color:blue; font-size:small; border:0px">


 Just a small note, which may be mostly a digression but:

 I don't think the above regex will match that string at all.

 The regex, because it has a + instead of a *, requires at least one 
 character between the " and color:blue ... your string doesn't have that.


 





Re: Advanced regex question - backtracking vs. negative lookaheads

2006-04-25 Thread Jeremy Fairbrass
Thanks guys for the clarifications! My understanding of how regex worked was
the same as Bowie's, ie:
-
 My understanding is that with [^"]+ the engine will scan from left to
 right until it finds a quote.  Then, in the context of the previous
 regex, it will start backtracking to find a match for color:blue.
-

I  use the free Regex Coach tool from http://www.weitz.de/regex-coach/ to
test my regex, and it works the way Bowie described above, ie. using
backtracking. In other words, using:

/style="[^"]+color:blue/

...the [^"]+ causes the regex to go all the way to the closing " character,
then backtracks until it finds the color:blue part. This also agrees with
what is explained at www.regular-expressions.info which I believe is a
reliable guide to Perl regex.

Also, Bowie suggested using laziness instead:

/style="[^"]+?color:blue/

But I believe laziness also uses backtracking, so I'm not sure there is
*much* of an advantage of this over the greedy regex shown above. Probably
the main advantage of the lazy version would be if there was little or no
text between the first quote-mark and the color:blue part, and/or lots of
text between color:blue and the last quote-mark, eg:

<span style="color:blue; font-size:small; border:0px">

...The regex would hit this much quicker using the lazy version than the
greedy version. But I'm not sure if there really is a difference, especially
if I want to be able to hit on SPAN tags that might have more text before
the color:blue OR might have more text afterwards. Probably it's six of
one and half a dozen of the other, right?! Why did David describe the lazy
version as slightly less good than the greedy version?

Incidentally the reason I used [^>]+ rather than [^"]+ was to prevent it
from using lots of memory if there was no closing quote - as an alternative
to using {1,20}.

In any case, both Bowie and David agree that my first solution using
(.(?!color))+ is a really bad idea, and that was the main thing I wanted to
know! :)
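
(For the archive: if this ever gets turned into an actual rule, a bounded, lazy variant of the regex discussed above - name and score purely illustrative - might look like:

# illustrative name/score only; bounded quantifier instead of an open-ended +
rawbody   JF_STYLE_BLUE   /style="[^"]{0,100}?color:blue/i
describe  JF_STYLE_BLUE   color:blue buried inside a style attribute
score     JF_STYLE_BLUE   0.1
)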

Thanks,
Jeremy




Bowie Bailey [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 David Landgren wrote:
 Bowie Bailey wrote:

 [...]

   An alternative solution would be this:
  
   /style="[^>]+color:blue/
 
  This looks better.  It is probably less resource-intensive than
  your previous attempt and is definitely easier to read.  But why
  are you looking for > when you anchor the beginning with a quote?
 
  How about this:
 
  /style="[^"]+?color:blue/
 
  This is also non-greedy, so it will start looking for the
  color:blue match at the beginning of the string instead of
  having the + slurp up everything up to the quote and then
  backtracking to find the match.

 The regexp engine doesn't slurp. It just scans from left to right,
 noting I might have to come back here along the way.

 Ok, so slurp was a bit of a simplification.  :)

  My understanding is that with [^"]+ the engine will scan from left to
 right until it finds a quote.  Then, in the context of the previous
 regex, it will start backtracking to find a match for color:blue.

 In any case, with the non-greedy quantifier, it will stop looking when
 it finds the first color:blue string instead of continuing to the
 end of the string.

 -- 
 Bowie







Advanced regex question - backtracking vs. negative lookaheads

2006-04-21 Thread Jeremy Fairbrass
Hi all,
I wonder if one of you regex gurus might be able to give me some advice 
regarding the most efficient way of writing a particular rule.

Let's say I want to use regex to search for the phrase color:blue within a 
span tag as in the example below (just a made-up example for the sake of 
this question):

<span style="border:0px; color:blue; font-size:small">

In this case, the color:blue part is preceded by some other text 
(border:0px) after the first quote mark, but that preceding text could in 
fact be anything, and I want to allow for the fact that it could be 
anything.

I've read at http://www.regular-expressions.info that it's best to avoid 
backtracking if possible because that is resource-intensive.

So one possible solution would be the following:

/style="(.(?!color))+.color:blue/

In other words, after the first " (quote mark) it looks for any character 
NOT followed by the word color, and repeats that with the + character, 
until it gets to the actual word color. I believe this results in no (or 
almost no?) backtracking. But I'm not sure if it's resource-intensive 
anyway, because of the negative lookahead - are negative lookaheads 
particularly resource intensive, when compared to backtracking? Is one 
preferable over the other?

An alternative solution would be this:

/style="[^>]+color:blue/

But this will certainly involve some backtracking, especially if there is 
even more text after the color:blue but before the closing > character, 
for example the font-size:small text.

So what do you think?! Which way is best, ie. most efficient or least 
resource-intensive?

Cheers,
Jeremy 





Re: Rawbody fooled by line breaks?

2006-04-12 Thread Jeremy Fairbrass
Hi Eric,
Actually the full rules don't ignore HTML at all - they are able to search 
within HTML tags quite fine, and also take into account line breaks, because 
they are run before SA does any decoding of the email. I use a bunch of 
custom full rules for this exact purpose.

From 
http://spamassassin.apache.org/dist/doc/Mail_SpamAssassin_Conf.html#rule_definitions_and_privileged_settings:
The full message is the pristine message headers plus the pristine message 
body, including all MIME data such as images, other attachments, MIME 
boundaries, etc.

In order to take into account line breaks you probably need to use the /s at 
the end of the rule, which enables single-line mode. Eg:
full  IMG_SRC  /img src cid:[0-9]+/is

...Although I don't think this exact rule will actually hit on anything, as 
the HTML will actually take the form of something like this:
<img src="cid:223505420@08042006-0FEA">
...with the equal sign and quote mark after src, and with not only digits 
but also other characters within the cid part, such as @ or hyphens etc. And 
you also have to take into account other tag attributes such as height, 
width which could exist between img and src. Furthermore, if the email 
was encoded in Quoted-Printable, there will probably look more like this 
(actual example from one of my emails):

<IMG height=3D72 =
src=3D"cid:223505420@08042006-0FEA" width=3D494=20
border=3D0>

Note the extra end-of-line equal-sign character on the first row, and the "3D" or 
"=20" bits which are put there by the Quoted-Printable encoding and which 
will not be removed by SA before the full rule is run.

So what I'd do is write a rule like this:

full  IMG_SRC  /img.{1,100}cid:/is

Or perhaps more efficiently, this one which doesn't use any backtracking:

full  IMG_SRC  /img ([^>](?!cid))+.cid:/is

I wouldn't bother trying to detect the string after the cid: bit, ie. the 
digits etc, unless you had a particular need to. Simply detecting the 
existence of cid: within the IMG tag is enough to determine the email has 
an embedded/inline image within the HTML.

Hope that helps!

Cheers,
Jeremy


---
Eric Hart [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
Hi folks,

Let's say that I want to recognize this HTML tag in a rawbody rule:
img src cid:[random number]
It's easy to write a rule that recognizes this.  I use rawbody because 
full and body ignore html.

Now suppose  that there's a line break in the html tag.  This is legal, and 
is still recognized by mail client:
img
src cid:[random number]
It's not possible to write a rawbody rule that recognizes this!

The problem seems to be that rawbody looks at the message one line at a 
time.  I won't bore you with every way I've tried to create a rule that 
spans this line break, but none of them have worked.

Has anyone enountered/resolved this issue?

Cordially,

Eric Hart
ehart at npi dot net 





Re: Best way to send spam for learning from OE and Outlook

2006-04-07 Thread Jeremy Fairbrass
I use Outlook 2003 and use a freeware Outlook toolbar called Outlook Spam 
Report Utility, available from http://www.olspamcop.org/download.shtml. 
It's designed to enable the easy forwarding of spam to SpamCop, but can 
easily be modified to forward spam or ham to your own mail server for 
learning, if your mail server has a special email address that you can use 
to send spam/ham samples to. So the toolbar makes it very easy for the 
end-user to forward spam and ham to the server on-the-fly as necessary, 
without needing to set up special rules or use IMAP or anything like that. 
Works well for me anyway... :)
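
(On the server side, those special spam/ham addresses generally just end up feeding sa-learn; a rough sketch, with the mailbox paths made up for the example:

# paths are placeholders for wherever the reported mail gets collected
sa-learn --spam --mbox /var/spool/learn/spam-reports.mbox
sa-learn --ham  --mbox /var/spool/learn/ham-reports.mbox
)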

Cheers,
Jeremy


Patrick Sherrill [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 What is the best way to send spam candidates from Outlook and Outlook 
 Express to spamassassin for learning?
 TIA.
 Pat... 





Using regex with bayes_ignore_header

2006-03-31 Thread Jeremy Fairbrass
Hi, can anyone tell me if it's allowed to use regex with bayes_ignore_header 
in local.cf? I've seen this done here and there by others but don't know if 
it's actually allowed or will cause things not to function properly. For 
example:

bayes_ignore_header X-Spam-\S+

If this *is* allowed, are there any pros or cons to it, eg. in terms of 
performance or anything? I guess the main advantage to it is making local.cf 
a little tidier and smaller if you have lots of similar headers you want to 
use with bayes_ignore_header.
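
(For comparison, the form that is unambiguously supported is simply listing each header on its own line, eg:

bayes_ignore_header X-Spam-Status
bayes_ignore_header X-Spam-Level
bayes_ignore_header X-Spam-Report
)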

Cheers,
Jeremy 





3.1.x rulesets with 3.0.x

2006-03-27 Thread Jeremy Fairbrass
Hi all,
Is it possible to use the SA 3.1.x rulesets (from 
http://spamassassin.apache.org/full/3.1.x/dist/rules/) on SA 3.0.4? In other 
words, simply downloading the .cf files from that URL and plonking them over 
the top of the existing 3.0.4 rulesets? Would that cause any problems? The 
advantage would be being able to use the new and/or modified rules from 
3.1.x in order to get better scoring results. But I'm not sure if there are 
new types of rule in 3.1.x which would not work with 3.0.x and would cause 
3.0.x to stop working or something. Any advice on this would be appreciated!

Cheers,
Jeremy 





Re: 3.1.x rulesets with 3.0.x

2006-03-27 Thread Jeremy Fairbrass
Okay, thanks anyway for the advice! I'd upgrade in a flash but unfortunately 
I'm not able to - I'm using MDaemon v8 which has SA bundled in such a way 
that it can't be separately upgraded.

Cheers,
Jeremy


Matt Kettler [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Jeremy Fairbrass wrote:
 Hi all,
 Is it possible to use the SA 3.1.x rulesets (from
 http://spamassassin.apache.org/full/3.1.x/dist/rules/) on SA 3.0.4?

 No.
 In other
 words, simply downloading the .cf files from that URL and plonking them 
 over
 the top of the existing 3.0.4 rulesets?

 No.

 Would that cause any problems?

 Yes, the stock rules make use of code from EvalTests.pm. The functions 
 available
 through that file are not the same in 3.0.4 and 3.1.0.

 The
 advantage would be being able to use the new and/or modified rules from
 3.1.x in order to get better scoring results.

 So upgrade. You'll not only get the stronger rules, but you'll get the 
 code
 improvements too.

 Often times code improvements (changes to the HTML parser, etc) have a 
 more
 dramatic effect on accuracy than changes to the rules alone.


 But I'm not sure if there are
 new types of rule in 3.1.x which would not work with 3.0.x and would 
 cause
 3.0.x to stop working or something. Any advice on this would be 
 appreciated!

 Cheers,
 Jeremy




 





Re: HTML spam not detected

2006-03-22 Thread Jeremy Fairbrass
Hi Emmanuel,
I have a custom rule which works nicely for me to catch those spams that use 
this HTML trick. I'll send it to you offline as I've heard it's not wise to 
post rules to the list (coz the spammers then see them) :)

Happy to send it to anyone else who asks too...

Cheers,
Jeremy


Emmanuel Lesouef [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Hi all,

 I get several spam that are HTML but they are not detected.

 In fact, they use the span mark out such as in this example :

 DIVFONT face=3DArial size=3D2Do you want to span style=3D
 float
 :
 right
  l /spanOspan style=3D
 float
 :
 right
  u /spanVspan style=3D
 float
 :
 right
  u /spanEspan style=3D
 float
 :
 right
  u /spanRspan style=3D
 float
 :
 right
  m /spanPspan style=3D
 float
 :
 right
  b /spanAspan style=3D
 float
 :
 right
  y /spanY for your span style=3D
 float
 :
 right
  j /spanMspan style=3D
 float
 :
 right
 

 How can I modify the rules of spamassassin to deal with it ?

 Thank you.

 --

 Emmanuel Lesouef
 





Re: HTML spam not detected

2006-03-22 Thread Jeremy Fairbrass
Was this one only in plain text, or did it include an HTML part as well? Can 
you give us the full body unaltered? Could be that it's using some other 
type of fancy HTML to make the text look like that.

Cheers,
Jeremy


Emmanuel Lesouef [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Emmanuel Lesouef wrote:
 Hi all,

 I get several spam that are HTML but they are not detected.

 In fact, they use the span mark out such as in this example :


 This one is also not detected :


 
 n V c a n I p i o u e m $ n 105 30 lp   j p y i g l j l k s
 c C g i j a c I i i r s $9 n 9 10 1j   o p r i a l s l a s
 d V d i y a e g k r a a $6 h 9 10 U9   a p y i g l m l v s

 n V c a n I p i o u e m $ n 105 30 lp   j p y i g l j l k s
 c C g i j a c I i i r s $9 n 9 10 1j   o p r i a l s l a s
 d V d i y a e g k r a a $6 h 9 10 U9   a p y i g l m l v

 o V b a x I w i l u w m $ v 105 30 tF   l p w i u l b l a s
 t C x i a a z I z i c s $ i 99 10 8h   r p d i j l n l r s
 s V h i f a p g m r p a $ j 69 1 u5 0  m p u i v l a l f s
 

 This one is very strange, as I cannot understand the meaning of it...

 Anyone got a trick ?

 PS : Thank you Jeremy for the HTML trick. :)

 --

 Emmanuel Lesouef
 





Re: Question about whitelist_from_rcvd

2006-03-22 Thread Jeremy Fairbrass
The wildcard isn't needed, and I doubt it's allowed either. See the info and 
examples at 
http://spamassassin.apache.org/full/3.0.x/dist/doc/Mail_SpamAssassin_Conf.html#whitelist_and_blacklist_options

Specifically, the string at the end of whitelist_from_rcvd which refers to 
the reverse DNS of the host, can either be the full hostname, or the domain 
component of that hostname. In other words, if the host that connected to 
your MX had an IP address that mapped to 'sendinghost.spamassassin.org', you 
should specify sendinghost.spamassassin.org or just spamassassin.org here.

So in your case, [whitelist_from_rcvd [EMAIL PROTECTED] somelist.org] 
would work (without the [] of course). The wildcard is effectively implied.
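
(Concretely, something along these lines - the sender address here is just an invented example:

# sender address invented for the example; the rDNS part is matched on the domain component
whitelist_from_rcvd  announce@somelist.org   somelist.org
)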

Cheers,
Jeremy



Frank Bures [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 There are lists that use various servers for their distributions.
 These servers can be described using wild cards as for instance
 *.somelist.org

 I tried to use such wild cards in local.cf as in

 whitelist_from_rcvd [EMAIL PROTECTED] *.somelist.org

 but the definition does not seem to be working.

 Is the '*' wild card use in whitelist_from_rcvd allowed?

 Thanks


 Frank Bures, Dept. of Chemistry, University of Toronto, M5S 3H6
 [EMAIL PROTECTED]
 http://www.chem.utoronto.ca
 PGP public key: 
 http://pgp.mit.edu:11371/pks/lookup?op=indexsearch=Frank+Bures
 -BEGIN PGP SIGNATURE-
 Version: PGPfreeware 5.0 OS/2 for non-commercial use
 Comment: PGP 5.0 for OS/2
 Charset: cp850

 wj8DBQFEIUWpih0Xdz1+w+wRAjwMAKDiX3vwC4ehE6cDqVfMHpUf65xkPACgkplc
 nw+l3EcIt0HNeNn4kKK7Ulk=
 =Ua27
 -END PGP SIGNATURE-


 





Re: news spam

2006-03-21 Thread Jeremy Fairbrass
What's the difference? Your meta rule is fundamentally identical to Loren's 
rule, is it not?!

Cheers,
Jeremy


[EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
Loren Wilton wrote:
 header LW_NONEWS    Subject =~ /^Re:\s.*\bnews$/i
...
 The .* should be safe in that regex since a subject isn't very long
 and the things on either side are anchored.

If you're paranoid you could do a couple of meta rules:

header _SUBJECT_STARTSWITH_RE Subject =~ /^Re:\s/i
header _SUBJECT_ENDSWITH_NEWS Subject =~ /\bnews$/i
meta SUBJECT_IS_RE_NEWS (_SUBJECT_STARTSWITH_RE && _SUBJECT_ENDSWITH_NEWS)

-- 
Matthew.van.Eerde (at) hbinc.com   805.964.4554 x902
Hispanic Business Inc./HireDiversity.com   Software Engineer





Re: rules for IP addresses without reverse DNS records?

2006-03-20 Thread Jeremy Fairbrass
Correct me if I'm wrong, but would a rule like the following one of mine not 
do the trick regardless of how the MTA writes the Received header, and be 
less prone (actually not prone at all) to spoofing?

header    JF_NO_PTR    X-Spam-Relays-Untrusted =~ /^\[ ip=[^ ]* rdns= helo=/
describe  JF_NO_PTR    No reverse lookup for sender IP in X-Spam-Relays-Untrusted
score     JF_NO_PTR    0.5

It's simply searching for a blank "rdns=" string (without quotes of course) 
in the X-Spam-Relays-Untrusted pseudoheader. It should only search the very 
first line in this pseudoheader, ie. the one that relates to the most recent 
untrusted relay as per http://wiki.apache.org/spamassassin/TrustedRelays.

I'm guessing, from what I've learnt at 
http://wiki.apache.org/spamassassin/TrustedRelays, that a blank "rdns=" 
string, ie. followed directly by a space, indicates a lack of a PTR record?

The reason why I think this would be better than searching within the 
Received header, is that in theory the info in an older Received header 
could be spoofed by the spammer so that it includes the name of your MTA. 
Perhaps this is unlikely, I dunno, but at least using 
X-Spam-Relays-Untrusted means you don't have that risk at all, right??!

Can anyone see any exceptions or issues with doing it this way?

Cheers,
Jeremy


Matthias Fuhrmann [EMAIL PROTECTED] wrote in 
message 
news:[EMAIL PROTECTED]
 On Sat, 18 Mar 2006, Dave Augustus wrote:


 Anyone point me in the right direction?

 I am just thinking of increasing the spam level counter based on whether
 they have a reverse IP address. I have tried to reject these outiright
 based on this criteria but that would cause too many false positives.

 this thread will help you:
 http://www.gossamer-threads.com/lists/spamassassin/users/11783?search_string=Reverse%20DNS%20Check;#11783

 just have a look at the rule named:  MY_NO_PTR

 regards,
 Matthias 





Re: Drug email keeps getting thru

2006-03-15 Thread Jeremy Fairbrass
Something's not right there - the URL mentioned in the spam
(deolich-MANGLED.com without the -MANGLED bit) should have hit on both the
SURBL.org and URIBL.com blacklists, yet I don't see hits for either in the
tests that were flagged for this spam - you only have
BAYES_40,FM_NO_STYLE,HTML_80_90,HTML_MESSAGE.

Also I'd expect at least those tests to give you some score other than 0.0.

I'd first suggest enabling the inline spam report which inserts a more
detailed listing into the headers of each test that triggered and the score
each test received. You can do that by adding the following lines to your
local.cf (check to see if they're not there already):
add_header all Report _REPORT_
report_safe 0

Once you can see the full report in the headers you can then see what score
each test is giving, and that helps with troubleshooting.

For the URIBL lookups to work, you also need the following in local.cf:
local_tests_only 0

The surbl.org rules are included in SpamAssassin already (within
25_uribl.cf), but I don't think the uribl.com rules are - so if you want to
also check the uribl.com blacklists as well, you can simply add the two
rules displayed at http://www.uribl.com/usage.shtml into a new .cf file or
alternatively stick them in your local.cf.

The mangled.cf ruleset won't help with this particular spam, as it's using a
fancy HTML trick to really obfuscate the drug names in such a way that
mangled.cf won't hit on them.

I think the key thing is to get the URIBL lookups working on your system and
also figure out why the tests that *did* trigger on this spam
(BAYES_40,FM_NO_STYLE,HTML_80_90,HTML_MESSAGE) gave a total score of 0.0.
I'm not sure what to check for that, but perhaps someone else can suggest
something. Once that's all solved, I think you'll find most of those spams
are getting nicely filtered out by SpamAssassin! :)

You could also create your own custom rule to filter on the subject of these
spams when they contain spammy words like PhaPOramacy, eg:

header    MY_DRUG_SPAM    Subject =~ /PhaPOramacy/i
describe  MY_DRUG_SPAM    Spam with 'PhaPOramacy' in the subject
score     MY_DRUG_SPAM    2.0

...and score it according to your needs. You can add other spammy subject
words to the above by inserting a pipe | character after PhaPOramacy and
then the additional word, eg:
Subject =~ /PhaPOramacy|word1|word2|word3/i

Lastly, you might find the attached rule of mine useful - it filters against 
the HTML trick used in this particular spam, assuming the HTML code at 
www.yoursummit.com/pharmNews.html is correct. Score accordingly. I can't see 
any reason why a non-spam email would use such HTML code, but I don't have 
any way of testing it against a corpus of spam/ham to check for false 
positives.

Cheers,
Jeremy



Tracey Gates [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
I have URIBL lookups enabled.  I have also increased my score in
mangled.cf.  I have posted the email that I'm receiving at
www.yoursummit.com/pharmNews.html if you'd like to view the actual email
content.  Below is the header of the latest email that I've gotten.  The
names of the drugs are in blue and the dollar amounts are in red along.
I'm still at a loss as to what I need to do to get these stopped.

Here is the output of doing the spamassassin --lint -D:

debug: config: read file /etc/mail/spamassassin/25_uribl.cf

debug: plugin: loading Mail::SpamAssassin::Plugin::URIDNSBL from @INC
debug: plugin: registered
Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0xa96f558)
debug: plugin: loading Mail::SpamAssassin::Plugin::Hashcash from @INC
debug: plugin: registered
Mail::SpamAssassin::Plugin::Hashcash=HASH(0xa95afa4)
debug: plugin: loading Mail::SpamAssassin::Plugin::SPF from @INC
debug: plugin: registered
Mail::SpamAssassin::Plugin::SPF=HASH(0xa95c66c)
debug: plugin: Mail::SpamAssassin::Plugin::URIDNSBL=HASH(0xa96f558)
implements '
parse_config'
debug: plugin: Mail::SpamAssassin::Plugin::Hashcash=HASH(0xa95afa4)
implements '
parse_config'


Here is the Header info:

Received: by yoursummit.com (CommuniGate Pro PIPE 4.3.8)
with PIPE id 2829044; Tue, 14 Mar 2006 04:05:46 -0600
Received: from [81.104.204.233] (HELO gcsincorp.com)
by yoursummit.com (CommuniGate Pro SMTP 4.3.8)
with SMTP id 2829043
for [EMAIL PROTECTED]; Tue, 14 Mar 2006 04:05:38 -0600
Subject: Re: PhaPOramacy news
Date: Tue, 14 Mar 2006 04:04:55 -0600
Message-Id: [EMAIL PROTECTED]
MIME-Version: 1.0
Thread-Topic: PhaPOramacy news
Priority: Normal
Importance: normal
X-MSMail-Priority: normal
X-Priority: 3
Sensitivity: Normal
From: Kanta Bramblett [EMAIL PROTECTED]
To: [EMAIL PROTECTED] [EMAIL PROTECTED]
X-Real-To: Tracey Gates [EMAIL PROTECTED]
X-Mailer: CommuniGate Pro MAPI Connector 1.1.22
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106
X-Spam-Checker-Version: SpamAssassin 3.0.2 (2004-11-16) on
yoursummit.com
X-Spam-Level:
X-Spam-Status: No, score=-0.0 required=3.5 tests=BAYES_40,FM_NO_STYLE,
HTML_80_90,HTML_MESSAGE autolearn=no 

Re: encoded spam that got thru

2006-03-13 Thread Jeremy Fairbrass
Hi Eric,
The text there is encoded with base64, which is decoded into the proper 
text by the mail client. SpamAssassin will also decode it before running its 
rules against it, for body or rawbody rules, which means SpamAssassin 
will be able to filter it out whether the text was encoded with base64 or 
was sent as plain text.

Without being able to decode that block of stuff myself and thus see what it 
says, I'd suggest firstly making sure you're using the URIBL/SURBL network 
checks (in case this spam had any web links in it), and also use the SARE 
stock rules at http://www.rulesemporium.com/rules.htm#stocks (you might find 
the other rules on that page useful in general too).
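
(Decoding a base64 block like that locally, just to see what it says, is a one-liner - for example, with the encoded chunk saved to a file:

# chunk.b64 = file containing just the base64 text copied from the message
perl -MMIME::Base64 -0777 -ne 'print decode_base64($_)' chunk.b64
)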

Cheers,
Jeremy


Eric W. Bates [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
I don't even understand how the following message works (let alone how
 to block it).

 It simply has a chunk of what looks like encoded binary; and yet,
 thunderbird renders it as a stock announcement (as I write this, I
 wonder whether the good readers of this list are likely to the ascii
 block, or the rendered version?  view source for me please).  The
 header: Content-Transfer-Encoding: base64 should probably give me a 
 clue.

 How do we filter out spam like this?  This got 0 hits.

 Thanks for your time.

 [snip]
 X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on 
 ace1.vineyard.net
 X-Spam-Level:
 X-Spam-Status: No, hits=0.0 required=5.0 tests=EMPTY_MESSAGE=,
 FM_NO_FROM_OR_TO=,FM_NO_TO=,MISSING_SUBJECT=,NO_RECEIVED=,TO_CC_NONE=
 bayes=0.5 autolearn=failed version=3.1.0

 [snip]

 From: Roxie F. Hankins [EMAIL PROTECTED]
 To: [EMAIL PROTECTED], [EMAIL PROTECTED]
 Subject: Focus Stock Alert
 Date: Sat, 11 Mar 2006 23:33:30 +
 MIME-Version: 1.0
 Content-Type: text/plain
 Content-Transfer-Encoding: base64
 X-Virus-Scanned: by AMaViS-ace1 at Vineyard.NET

 SW4gdGhlIGN1cnJlbnQgb2lsIG1hcmtldCwgc2VsZWN0IHNtYWxsIGVuZXJn
 eSBkZWFscyBhcmUgZmx5aW5nLiAgDQpXaXRoIGdyb3dpbmcgZGVtYW5kLCBz
 aHJpbmtpbmcgc3VwcGxpZXMsIGFuZCBnb3Zlcm5tZW50IHN1cHBvcnQgDQpm
 b3IgZG9tZXN0aWMgZW5lcmd5IHByb2plY3RzLCBpcyB0aGVyZSBhIGJldHRl
 ciBzZWN0b3IgdG8gaW52ZXN0IGluPyANCkhlcmUncyBvdXIgbmV4dCB3aW5u
 ZXI6DQoNCkNvOiBQcmVtaXVtIFBldHJvfF9ldW0gSW5jLg0KU3ltOiBQIFAg
 VCB8XyAgDQpDdXJyZW50bHkgVHJhZGluZyBhdDogJDAuMDIgICAgDQoxIFdl
 ZWtfVGFyZ2V0X1ByaWNlOiAgJDAuMTANCg0KQSBNYXNzaXZlIFBSIENhbXBh
 aWduIGlzIFVuZGVyd2F5IGZvciBGcmlkYXkgaW50byBuZXh0IHdlZWshIQ0K
 U3RhcnRpbmcgYXQgb25seSAyIGNlbnRzIHRoZSBHYWlucyB3aWxsIGJlIHRy
 ZW1lbmRvdXMhIQ0KDQpIVUdFIG5ld3MgY29taW5nIG91dCBmb3IgUCBQIFQg
 fF8uIERpZCB0aGV5IHN0cmlrZSBvaWw/DQpQbGVhc2UgcmVhZCBhbGwgdGhl
 IGxhdGVzdCBQcmVzcyBSZWxlYXNlcyBvbiB0aGUgY29tcGFueS4NCldlIGFk
 dmlzZSBvdXIgcmVhZGVycyB0byBnZXQgaW4gZWFybHkhIFRoaXMgb25lIGlz
 IGdvaW5nIHVwIGZhc3QhDQoNClByZW1pdW0gUGV0cm9sZXVtLCBJbmMuIGlz
 IGEgZGl2ZXJzaWZpZWQgZW5lcmd5IGNvbXBhbnkgZm9jdXNlZCBvbiANCmV4
 cGxvaXRpbmcgdGhlIHZhc3Qgb2lsIGFuZCBnYXMgcmVzZXJ2ZXMgb2YgTm9y
 dGhlcm4gQ2FuYWRhLiBXaXRoIGEgDQpzdHJvbmcgbWFuYWdlbWVudCBhbmQg
 dGVjaG5pY2FsIHRlYW0sIFByZW1pdW0gUGV0cm9sZXVtIHdpbGwgYXBwbHkg
 DQppbm5vdmF0aXZlIHRlY2hub2xvZ2llcyB0b3dhcmRzIHRoZSBkaXNjb3Zl
 cnkgYW5kIGRldmVsb3BtZW50IG9mIGEgDQpkaXZlcnNlIHBvcnRmb2xpbyBv
 ZiBoaWdoIHZhbHVlLCBsb3cgcmlzayBlbmVyZ3kgcHJvamVjdHMuICANCiAg
 ICAgICAgICAgICANCiAgICAgICAgICAqIEdPT0QgTFVDSyAmIFRSQURFIE9V
 VCBUSEUgVE9QICo=


 





Saving pseudoheaders

2006-03-12 Thread Jeremy Fairbrass
Hi all,
Can anyone tell me if it's possible to make SA (3.0.x) save the 
X-Spam-Relays-Trusted and X-Spam-Relays-Untrusted pseudoheaders within the 
actual headers of each email, or at least somewhere else, so I can see what 
they say for each email received? Eg. perhaps there is some setting in 
local.cf or elsewhere which can enable the saving of this info in the 
headers?
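
(One guess at the first question is the add_header route - something like the following in local.cf, assuming the _RELAYSTRUSTED_ / _RELAYSUNTRUSTED_ template tags are available in this SA version:

# assumes these relay template tags exist in the SA version in use
add_header all Relays-Trusted   _RELAYSTRUSTED_
add_header all Relays-Untrusted _RELAYSUNTRUSTED_
)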

Second question: when a non-local, unknown host connects to my server (which 
is running SA), and gives its HELO name (eg. HELO domain.com), will 
domain.com appear in X-Spam-Relays-Trusted or X-Spam-Relays-Untrusted? I 
find this a little tricky to get my head around, although I have read 
http://wiki.apache.org/spamassassin/TrustedRelays. I understand that this 
situation is unique in that the IP and rDNS of the connecting host *are* 
trusted, because it's connecting to my server which I can trust, but the 
HELO is *not* trusted because the connecting host could obviously say 
whatever it wanted to for the HELO. So does this HELO appear in 
X-Spam-Relays-Trusted or X-Spam-Relays-Untrusted?

http://wiki.apache.org/spamassassin/TrustedRelays is unclear for me because 
it refers to a setup where SA is on a server internal.example.com which is 
protected from the outside world by another local server, dmz.example.com. 
So I guess I'd want to know what would happen if SA was on dmz.example.com 
(the machine that talks to the outside world). How would dmz.example.com 
view notrust.example.com if notrust.example.com were to connect directly to 
dmz.example.com? :)

Thanks,
Jeremy 





Re: more pharmacy woes

2006-03-10 Thread Jeremy Fairbrass
You could also easily filter based on the subject, if it's always something
obvious like Parhamcy news, and perhaps on obvious misspellings like
tabIet, abIets etc (note the capital I instead of the l). And I don't think it
would be too hard to create a special rule to search for a long string of
individual characters with spaces between them followed by a dollar-sign. :)
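
(A very rough sketch of that last idea - rule name made up here, and it would want a proper mass-check for false positives before being given a real score:

# made-up example rule; counts 10+ single spaced-out letters followed by a dollar sign
body      JF_SPACED_SINGLES   /(?:[a-z]\s){10,}\$/i
describe  JF_SPACED_SINGLES   Long run of spaced-out single letters followed by a dollar sign
score     JF_SPACED_SINGLES   1.0
)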

- Jeremy



Payal Rathod [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
 Hi all,
 I need help in decoding pharmacy spam again. I am getting 100s of them.
 I have attached them at,
 http://pastebin.ca/45108

 Can someone tell how to block these things out?
 With warm regards,
 -Payal








Re: GIF stock spams

2006-02-28 Thread Jeremy Fairbrass
Hi Loren, thanks for the feedback and suggestions! I didn't actually realise 
that the @ symbol had to be escaped - my bad! I'm learning as I go... What a 
pain that rawbody only does one line at a time; but at least now I know this 
for sure - previously I wasn't completely sure about that.


SARE and anybody else would be welcome to use my rules, although I imagine 
they would be able to find a more efficient or less FP-risky way of writing 
them. I haven't done any mass checks with them or anything.


Cheers,
Jeremy


- Original Message - 
From: Loren Wilton [EMAIL PROTECTED]

To: users@spamassassin.apache.org
Sent: Tuesday, February 28, 2006 10:14 AM
Subject: Re: GIF stock spams



Interesting set of rules, they look like they should do fairly well.  I'll
run a masscheck on them in a minute.  If they are decent I'm sure SARE 
would

be happy to include them in the stock spam ruleset if you give permission.

The only thing I see that makes me a little nervous is the unescaped @ in
the first rule.  This is probably working because it isn't followed by an
alphabetic character.  Also you have a few .* or other .+ globs.  It's
always better to do a {0,something} or {1,something} size limit on these 
to

keep them from running away in unexpected conditions.

You are correct that rawbody rules are largely useless, since they won't
allow any multiline checking.  There is flatly no way to check more than 
one

line in a rawbody rule, so you are forced into using full for this sort of
thing.  A long-standing gripe of mine.

   Loren 


Re: GIF stock spams

2006-02-28 Thread Jeremy Fairbrass
Could you kindly explain to me about the @ character and why it needs to be 
escaped, or in what conditions it needs to be escaped? Eg. you seem to imply 
that it only needs to be escaped if followed by an alphabetic character. Is 
that the only rule or are there other occasions when it should be escaped? 
What's the reason behind this anyway? Is the @ symbol perhaps sometimes used 
for other purposes (like the . or ? or + or other special regex characters)? 
I've never read anything before about escaping the @ and I can't find any 
info on this at the moment eg. at sites like 
http://www.regular-expressions.info.


Cheers,
Jeremy


- Original Message - 
From: Loren Wilton [EMAIL PROTECTED]

To: users@spamassassin.apache.org
Sent: Tuesday, February 28, 2006 10:14 AM
Subject: Re: GIF stock spams



Interesting set of rules, they look like they should do fairly well.  I'll
run a masscheck on them in a minute.  If they are decent I'm sure SARE 
would

be happy to include them in the stock spam ruleset if you give permission.

The only thing I see that makes me a little nervous is the unescaped @ in
the first rule.  This is probably working because it isn't followed by an
alphabetic character.  Also you have a few .* or other .+ globs.  It's
always better to do a {0,something} or {1,something} size limit on these 
to

keep them from running away in unexpected conditions.

You are correct that rawbody rules are largely useless, since they won't
allow any multiline checking.  There is flatly no way to check more than 
one

line in a rawbody rule, so you are forced into using full for this sort of
thing.  A long-standing gripe of mine.

   Loren 


Re: GIF stock spams

2006-02-28 Thread Jeremy Fairbrass
Okay I've rewritten the first line of the rule in a way I think is better 
(mind any line breaks)...


full  __JF_STOCKSPAM1a  /- Original Message -[^\n]*\nFrom:[^\n]+\nTo:[EMAIL PROTECTED]@[^\n]+\nSent:[^\n]+\nSubject:[^\n]+\n{5,20}\w+/i


I've exchanged the .* and .+ with [^\n] (negated character class) which I 
read is a better method as it doesn't use backtracking. Although I know what 
you mean about {0,xx} being better than .* in order to prevent the rule from 
running away in unexpected conditions, but I think in this case it's not 
so important because the \n restricts each part of the rule to within one 
line. I mean for example after the From: text, the + will allow for 
unlimited characters to follow the colon but only until the end of the 
line - so it's not really possible for the rule to run away too far in the 
way you mean - right?! :) Correct me if I'm wrong. But I think this is 
useful because the spammer could actually use a decent length of text 
following the colon on each line - eg. after From: and To: and Subject: 
etc - there could be a decent length of text following - so easier to use 
the + until it reaches the end of the line and the \n. Hope I make sense! I 
understand me anyway which I'm sure should count for something..


Cheers,
Jeremy



- Original Message - 
From: Loren Wilton [EMAIL PROTECTED]

To: users@spamassassin.apache.org
Sent: Tuesday, February 28, 2006 10:33 AM
Subject: Re: GIF stock spams



although I imagine
they would be able to find a more efficient or less FP-risky way of

writing

them.


Not necessarily.  Other than the things I mentioned, I don't see anything
particularly scarey about these rules.  We have certainly written rules of
this sort to catch other things.  By preference we'd go for multiple 
rawbody

rules and a meta.  But there are things that you can't catch reliably that
way, and this might be one of them.

No telling if the rules are actually decent until the mass check results
come out.  You can have something that works really well for you, and it
will hit absolutely nothing for anyone else, or it will FP all over the
place.  Usually in the later case you can see what went wrong and make a
variation that will get around most or all the FPs, and sometimes you can
widen a rule to hit more spam and not FP.

   Loren