Re: Macro virus fun

2016-04-07 Thread Matt Garretson
On 4/6/2016 3:23 PM, Alex wrote:
> Can you tell us more about the OLE2 result, and how you obtained it
> from clamav, in hopes I could do something similar with amavis?

IIRC, all you have to do is make sure your clamd.conf includes
these two settings:

ScanOLE2 yes
OLE2BlockMacros yes

Then, according to the clamd.conf manpage, 'OLE2 files with VBA
macros, which were not detected by signatures will be marked as
"Heuristics.OLE2.ContainsMacros".'

Since I call clam from mimedefang, I just pattern-match for that hit
string and act accordingly.
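
A minimal sketch of that pattern match, in Python rather than the Perl a mimedefang filter would use. It assumes the scanner reports hits in clamscan's usual "name: Signature FOUND" form; the function name and the sample lines are illustrative only.

```python
import re

# Signature name as quoted from the clamd.conf manpage.
MACRO_HIT = "Heuristics.OLE2.ContainsMacros"

# Assumed report format: "<file or stream>: <signature> FOUND"
_FOUND_RE = re.compile(r"^(?P<name>.*): (?P<sig>\S+) FOUND$")

def macro_hits(report):
    """Return the names of scanned items flagged by the macro heuristic."""
    hits = []
    for line in report.splitlines():
        m = _FOUND_RE.match(line.strip())
        if m and m.group("sig") == MACRO_HIT:
            hits.append(m.group("name"))
    return hits

if __name__ == "__main__":
    sample = "invoice.doc: Heuristics.OLE2.ContainsMacros FOUND\nnotes.txt: OK"
    print(macro_hits(sample))
```

Matching on the exact signature name, rather than on any FOUND line, keeps ordinary virus hits on their existing code path.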

We are getting a bit OT from SA, but hopefully that can help you get going.




Re: Macro virus fun

2016-04-06 Thread Matt Garretson
On 4/5/2016 8:40 PM, Alex wrote:
> These targeted macro viruses are killing us. I hoped someone would
> [...]
> What strategy are other people using to block zero-day macro viruses?


I quarantine these before they get to SA with some logic in mimedefang
that combines the OLE2 result from clamav with bogofilter scores and a
couple of arbitrary pattern matches that I update as needed.
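
One way such a combination could look, sketched in Python. The post does not say how the three signals are weighed, so the threshold, the subject patterns, and the "macro plus one other bad sign" rule are all assumptions made for illustration.

```python
import re

# Hypothetical values; the post gives neither thresholds nor patterns.
BOGO_SPAM_THRESHOLD = 0.9
SUSPICIOUS_SUBJECTS = [
    re.compile(r"\binvoice\b", re.IGNORECASE),
    re.compile(r"\bscan(ned)?\b", re.IGNORECASE),
]

def should_quarantine(has_macros, bogo_score, subject):
    """Quarantine only when a macro document coincides with another bad sign."""
    if not has_macros:
        return False
    if bogo_score >= BOGO_SPAM_THRESHOLD:
        return True
    return any(p.search(subject) for p in SUSPICIOUS_SUBJECTS)
```

Requiring a second signal is what keeps legitimate macro-bearing documents from being quarantined on the OLE2 result alone.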


Re: My new method for blocking spam - example

2016-01-20 Thread Matt Garretson
I am not an expert but it does seem like the main novel thing is how
(and how many) multi-word tokens are generated.  I have been using
multi-word tokens with bogofilter for years and it does help.  Of course
bogofilter only uses adjacent words -- perhaps OP's way of combining
words could yield an increase in accuracy, at the expense of processing
time.
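
For readers unfamiliar with the idea, here is a rough Python sketch of adjacent-word token generation. It is not bogofilter's actual tokenizer; the word-splitting regex and the function name are assumptions.

```python
import re

def tokens_with_pairs(text):
    """Return single-word tokens plus tokens built from adjacent word pairs."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs

if __name__ == "__main__":
    print(tokens_with_pairs("Buy cheap pills"))
```

A message of n words yields n - 1 extra tokens this way; combining non-adjacent words grows that number much faster, which is where the processing-time cost would come from.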

The stuff about not-matching rather than matching seems like nonsense.

Not to sound mean, but this is not the first time OP has come out with
the latest greatest revolution in spam blocking.  :)  I admire his
dedication, in any case!



Re: Rules Needed to verify bank fraud

2012-08-24 Thread Matt Garretson
In my experience, banks and financial institutions tend to be among the
worst offenders against sane bulk mailing practices.  SPF or DKIM will
be broken or inconsistently applied, and sender/relay domains seem to
vary with the weather.  I think it will be tough to nail down all the
valid domains a bank might use to contact its clients.  You'd think that
banks would care enough to do things right, but in many cases they
really seem not to.

The general technique proposed is effective, but I think that trying to
create and maintain a list like this for more than a handful of banks
will be a hassle at best, and will be highly prone to false positives.

It might still be worth trying, but I just wanted to vent my pessimism. :)



Re: [Q] Adjusting Rule Scores - Which file?

2011-02-17 Thread Matt Garretson
On 2/17/2011 10:51 AM, J4K wrote:
 How could I list the default?


Something like this might get you started:

 grep -R  RDNS_DYNAMIC /var/lib/spamassassin/* | grep -i score
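
If you would rather pull the values out programmatically, the same filter can be sketched in Python. This assumes the usual "score RULE value [value ...]" line layout; the function name and sample lines are made up for illustration.

```python
def find_scores(rule, lines):
    """Return the list of score values from each 'score <rule> ...' line."""
    found = []
    for line in lines:
        # Drop trailing comments, then split on whitespace.
        parts = line.split("#", 1)[0].split()
        if len(parts) >= 3 and parts[0] == "score" and parts[1] == rule:
            found.append(parts[2:])
    return found

if __name__ == "__main__":
    sample = [
        "describe RDNS_DYNAMIC Delivered by a host with dynamic-looking rDNS",
        "score RDNS_DYNAMIC 1.0 2.0 1.0 2.0  # sample values only",
        "score OTHER_RULE 3.0",
    ]
    print(find_scores("RDNS_DYNAMIC", sample))
```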




Re: facebook phishing, SPF_PASS

2010-11-19 Thread Matt Garretson
On 11/19/2010 3:13 PM, Michael Scheidell wrote:
 Thought you would be interested, a facebook phishing email (yes, it is, 
 ) with SPF_PASS
 (reminding EVERYONE, SPF IS NOT A SPAM VS HAM INDICATOR AT ALL)


Hi, SPF CAN BE YOUR FRIEND HERE:

 header LOCAL_FROM_FBM  from =~ /\...@facebookmail\.com/i
 score LOCAL_FROM_FBM 50.0
 whitelist_from_spf   *...@facebookmail.com

Of course, Facebook also uses DKIM so the third line above could just as
well be:

 whitelist_from_dkim   *...@facebookmail.com

or even:

 whitelist_auth   *...@facebookmail.com

So in this case, SPF isn't a necessity, but it certainly works. I do
similar things in various combinations for many commonly-forged domains.

YMMV...


Re: facebook phishing, SPF_PASS

2010-11-19 Thread Matt Garretson
On 11/19/2010 4:22 PM, Michael Scheidell wrote:
 On 11/19/10 4:17 PM, Matt Garretson wrote:
 whitelist_from_spf   *...@facebookmail.com
 ah, not if you have dns issues.  if you have dns issues, spf and/or dkim 
 will fail and legit email will not pass!

True, perhaps, but a *lot* of things will stop working if you have DNS
issues.  :)

 reason I mention it the first time, was one of my facebook_forgery rules 
 looked for spf_pass (didn't whitelist it!) but didn't add the 5 points 

Yes, you're right; SPF_PASS on its own isn't much of a help.


Re: facebook phishing, SPF_PASS

2010-11-19 Thread Matt Garretson
On 11/19/2010 5:03 PM, Michael Scheidell wrote:
 with SPF, it could be the senders dns servers, or if they use includes, 
 the dns servers for that side, so, its dangerous to add +50 points, say, 
 and then use spf/dkim or auth to whitelist.


You do have a valid point, but I'm not too worried about it 
myself, since I use this method only for big domains which 
are unlikely (IMO) to frequently have the type of DNS failures 
you speak of.

Hmm, I wonder if you could protect against DNS failures with 
something like:

 meta __LOCAL_GOT_SPF (SPF_PASS||SPF_NEUTRAL||SPF_FAIL||SPF_SOFTFAIL||SPF_HELO_PASS||SPF_HELO_NEUTRAL||SPF_HELO_FAIL||SPF_HELO_SOFTFAIL)
 header __LOCAL_FROM_FBM1  from =~ /\...@facebookmail\.com/i
 meta LOCAL_FROM_FBM  ( __LOCAL_FROM_FBM1 && __LOCAL_GOT_SPF )
 score LOCAL_FROM_FBM 50.0
 whitelist_from_spf   *...@facebookmail.com

My idea is that, in the case of DNS failures or timeouts while 
looking up SPF, __LOCAL_GOT_SPF would be false (I think), thus
preventing the 50.0 penalty.  And in the normal case where DNS
is okay, the penalty and whitelisting would function as before.
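
The intended logic can be modelled in a few lines of Python. This only simulates the boolean combination, reading the meta rule as a conjunction of the two sub-rules, and it leans on the same unverified assumption as above: that no SPF_* rule fires when the lookup fails or times out.

```python
# The eight SPF result rules listed in __LOCAL_GOT_SPF.
SPF_RESULT_RULES = {
    "SPF_PASS", "SPF_NEUTRAL", "SPF_FAIL", "SPF_SOFTFAIL",
    "SPF_HELO_PASS", "SPF_HELO_NEUTRAL", "SPF_HELO_FAIL", "SPF_HELO_SOFTFAIL",
}

def penalty_applies(from_matches, rules_hit):
    """Model LOCAL_FROM_FBM: From matches AND some SPF result was obtained."""
    got_spf = bool(SPF_RESULT_RULES & set(rules_hit))
    return from_matches and got_spf

if __name__ == "__main__":
    print(penalty_applies(True, {"SPF_FAIL"}))   # forged mail, DNS working
    print(penalty_applies(True, set()))          # DNS failure: no SPF rule hit
```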

Would that work, or is it crazy?


 clients complain of course, if you miss one spam, and complain, of 
 course if you block one legit email.

Yes, that's what makes our jobs so interesting.  :)




Re: Match returned message headers on any NDR

2010-04-14 Thread Matt Garretson
On 4/14/2010 2:23 PM, Kris Deugau wrote:
 I'm looking for a way to match on that original-message content - after 
 all, that's the real spam payload;  the rest of the message is perfectly 
 legitimate.


Despite conventional wisdom to the contrary, I have been training Bayes
on bounces (both spam and ham) for years with at least semi-decent
results when it comes to backscatter. That'd be one potential way to get
at the original content (when it's available). But I'd advise against
doing it blindly.

NB: For historical reasons, I use bogofilter rather than SA as my
Bayesian engine.


Re: FREEMAIL_ENVFROM_END_DIGIT score

2010-03-30 Thread Matt Garretson
On 3/29/2010 3:31 PM, Michael Scheidell wrote:
 WAY too many gmail and hotmail and yahoo accounts out there, and they 
 HAVE TO END IN DIGITS.so, FREEMAIL-ENVFROM_END_DIGIT is redundant with 
 FREEMAIL.


Agreed. My data point FWIW: since yesterday, FREEMAIL_ENVFROM_END_DIGIT
here has hit on 2304 hams and 91 spams. This ratio may be a little high
due to certain characteristics of our email, but I reduced the score to
-0.01 some time ago.
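
Worked out from the two counts above (plain arithmetic, nothing site-specific beyond those numbers):

```python
hams, spams = 2304, 91

total = hams + spams              # all messages the rule hit
spam_fraction = spams / total     # share of hits that were spam
ham_to_spam = hams / spams        # hams hit for every spam hit

# prints "3.8% of hits were spam; 25.3 hams per spam"
print(f"{spam_fraction:.1%} of hits were spam; {ham_to_spam:.1f} hams per spam")
```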


Re: Pathological messages causing long scan times

2010-03-18 Thread Matt Garretson
On 3/18/2010 5:15 PM, Kris Deugau wrote:
 Here's one pretty much guaranteed to peg a CPU core for ~130 seconds (or 
 more):
 
 http://pastebin.com/2ssy2YEk


Interesting. I see the same thing as you on that message. There's a 
two-minute gap between these two debug lines:

 rules: ran body rule __F_LARGE_MONEY_2 == got hit: 00 million
 rules: ran body rule __SEEK_FRAUD_JFMEJI == got hit: Insurance premium and Clearance Certificate Fee

One CPU is mostly pegged during that period. Thinking it had something 
to do with JM_SOUGHT_FRAUD_3, I removed that and __SEEK_FRAUD_JFMEJI,
with no improvement. I don't know where else to look, aside from 
trial-and-error disabling of rules.



Re: Pathological messages causing long scan times

2010-03-18 Thread Matt Garretson
On 3/18/2010 5:56 PM, Matt Garretson wrote:
 On 3/18/2010 5:15 PM, Kris Deugau wrote:
 Here's one pretty much guaranteed to peg a CPU core for ~130 seconds (or 
 http://pastebin.com/2ssy2YEk
 
 Interesting. I see the same thing as you on that message. There's a 
 two-minute gap between these two debug lines:


Looking in more detail at the debug output, I see this 
towards the end (after the delay):

 async: select found 2 responses ready (t.o.=0.0)
 async: completed in 120.380 s: URI-A, A:ns1.refactoring.lt.
 dns: providing a callback for id: 39923/147.36.61.92.zen.spamhaus.org/A/IN
 async: starting: URI-DNSBL, DNSBL:zen.spamhaus.org.:147.36.61.92 (timeout 15.0s, min 3.0s)
 async: completed in 120.380 s: URI-A, A:ns4.aleja.lt.
 dns: providing a callback for id: 34683/6.12.79.77.zen.spamhaus.org/A/IN
 async: starting: URI-DNSBL, DNSBL:zen.spamhaus.org.:6.12.79.77 (timeout 15.0s, min 3.0s)
 async: queries completed: 2, started: 2
 async: queries active: URI-DNSBL=2 at Thu Mar 18 17:22:45 2010
 dns: harvested completed queries


It looks like a DNS call (or two?) for URI-A took 120 seconds to return.
Is that a mere coincidence, or could that be causing a spin of some sort?

I can understand a delay caused by slow DNS, but consuming a core seems
strange.
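
For anyone who wants to pick such lines out of a long debug log, here is a small Python sketch. It assumes the "async: completed in N s: ..." wording shown in the excerpt; the 15-second default threshold is an arbitrary choice.

```python
import re

# Matches lines such as "async: completed in 120.380 s: URI-A, A:ns4.aleja.lt."
_COMPLETED_RE = re.compile(
    r"async: completed in (?P<secs>\d+(?:\.\d+)?) s: (?P<what>.+)$"
)

def slow_lookups(debug_lines, threshold=15.0):
    """Return (seconds, description) for lookups slower than the threshold."""
    slow = []
    for line in debug_lines:
        m = _COMPLETED_RE.search(line)
        if m and float(m.group("secs")) > threshold:
            slow.append((float(m.group("secs")), m.group("what").strip()))
    return slow

if __name__ == "__main__":
    log = [
        " async: select found 2 responses ready (t.o.=0.0)",
        " async: completed in 120.380 s: URI-A, A:ns1.refactoring.lt.",
        " async: completed in 120.380 s: URI-A, A:ns4.aleja.lt.",
    ]
    for secs, what in slow_lookups(log):
        print(secs, what)
```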

-Matt


Re: Pathological messages causing long scan times

2010-03-18 Thread Matt Garretson
On 3/18/2010 6:06 PM, Matt Garretson wrote:
 It looks like a DNS call (or two?) for URI-A took 120 seconds to return.
 Is that a mere coincidence, or could that be causing a spin of some sort?


FWIW, strace shows spamassassin doing this about twice a second 
(with varying arguments) during the two-minute delay:

 brk(0x69df000)  = 0x69df000
 mremap(0x7fc9756db000, 1298432, 1302528, MREMAP_MAYMOVE) = 0x7fc9756db000
 mremap(0x7fc9756db000, 1302528, 1306624, MREMAP_MAYMOVE) = 0x7fc9756db000
 []




Re: Newest spammer trick - non-blank subject lines?

2010-02-11 Thread Matt Garretson
On 2/11/2010 8:08 AM, Per Jessen wrote:
 The only minor issue I see is that a lot
 of people don't understand NDRs (or can't be bothered to try to).


True.  Also, a lot of mail relays mangle NDRs beyond usability.


Re: Spam from compromised web mails

2009-12-15 Thread Matt Garretson
On 12/15/2009 9:31 AM, The Doctor wrote:
 On Tue, Dec 15, 2009 at 12:55:00PM +0530, Rajkumar S wrote:
 Occasionally I receive mail from compromised web mails asking user
 name and password from my users. The source IPs are usually clean (as
 they are legitimate mail servers) and do not catch any ip based rules.


Do you use Bayes?  Bogofilter (another bayesian filter) catches 
those here.  The one you posted scored 0.94 here and would have
been dropped.


Re: Spam from compromised web mails

2009-12-15 Thread Matt Garretson
On 12/15/2009 10:37 AM, Yet Another Ninja wrote:
 even using site wide, autolearning will help your detection a LOT.
 Don't underestimate it...


Heartily agreed. Site-wide Bayes here (a single 
database for 2000+ users) catches 40% of our spam. 
It could certainly catch more, but the first 55% 
is caught beforehand by clamav/sanesecurity.  (This 
leaves only the last 5% to get scooped up by SA.)
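
Put another way, and assuming all three percentages are shares of the total spam caught, the arithmetic for how much of the spam reaching Bayes it actually stops is:

```python
clam_pct, bayes_pct, sa_pct = 55, 40, 5   # shares of all spam caught

reaching_bayes = 100 - clam_pct           # spam left after clamav/sanesecurity
bayes_share = bayes_pct / reaching_bayes  # fraction of that which Bayes stops

# prints "Bayes stops 88.9% of the spam that reaches it"
print(f"Bayes stops {bayes_share:.1%} of the spam that reaches it")
```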


OT: Re: NOT really about Unhindered Pharma Spam

2009-11-30 Thread Matt Garretson
Chris Owen wrote:
 Why anyone replies to this guy about anything is beyond me.
 Adding him to a kill file doesn't do much good when you still 
 see the other half of the argument. 


+1

If you must feed the trolls, please at least don't quote them.




Re: HABEAS_ACCREDITED SPAMMER

2009-11-24 Thread Matt Garretson
Daniel J McDonald wrote:
 Although these don't all appear to be business related, very few would
 be marked as spam without the HABEAS_ACCREDITED bonus.
 First, the suspicious ones:
 [snip]


FWIW, a good number of those in your list I'm pretty sure 
are legit opt-in newsletters (term used loosely... they
mainly consist of ads and special offers).  Sure, they're 
stupid and ultimately useless from my point of view, but 
AFAICT they are sent only to people who've requested them 
(at least in my experience, with my users).

Obviously every admin has to decide what to block and not
to block, but I just wanted to add a data point. I try 
not to block stuff my users have signed up for, as inane
as the messages may be (to me).




Re: HABEAS_ACCREDITED SPAMMER

2009-11-24 Thread Matt Garretson
Matt Garretson wrote:
 FWIW, a good number of those in your list I'm pretty sure 
 are legit opt-in newsletters (term used loosely... they
 mainly consist of ads and special offers).  Sure, they're 


Followup to myself: I have no opinion on the HABEAS issue,
but a couple years ago I decided to disable the rules 
altogether, and still don't really see a need to score 
either way on the accreditation.



Re: there goes the uri scripts..

2009-11-02 Thread Matt Garretson
Bernd Petrovitsch wrote:
 Think about domain names which (ab)use IDN to generate a very similar
 text strings (read: glyphs) (especially with the default font in our
 beloved monopoly-OS) to serious ones.


Good point.  It will be fun when grandma loses her glasses and
clicks on a link to ämazon.com  or  þankofamerica.com  
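
A crude first line of defence is simply to flag hostnames that are not plain ASCII. The Python sketch below does only that: it does not detect lookalikes as such, and the function name is made up for illustration.

```python
def has_non_ascii_label(hostname):
    """True if any label is non-ASCII or is already punycode (xn-- prefix)."""
    for label in hostname.lower().split("."):
        if not label.isascii() or label.startswith("xn--"):
            return True
    return False

if __name__ == "__main__":
    for host in ("amazon.com", "ämazon.com", "þankofamerica.com"):
        print(host, has_non_ascii_label(host))
```

This will also flag perfectly legitimate internationalized domains, so it is better suited to adding a little score than to outright blocking.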




Re: Looking for list of bank domains

2009-03-30 Thread Matt Garretson
Marc Perkel wrote:
 I'd like to get a more complete list of banks or bank like institutions
 and sites where hackers are trying to steal passwords to log into
 people's accounts. Here's my small list. Like to get more. I might set


What about webmail sites that people phish for?  And social 
networking sites?  And online stores?  And...

Trying to maintain a list of phishing targets will eventually 
converge with a list of all web sites.  :)

Why not just let bayes figure out which URLs appear most often 
in spam?  Maintaining a list by hand seems like a lot of effort.

-Matt


Re: Using SpamAssassin for just the Bayesian filtering?

2009-03-24 Thread Matt Garretson
Randy J. Ray wrote:
 filtering on other content, filtering that isn't the same as spam-testing. In
 a nutshell, we currently use the bogofilter application to classify messages,
 and invoke it with different word-list files to represent different filtering
 requirements. But this isn't going to scale well for us as written, and I'm
 the


If you use sendmail, then consider doing everything from within mimedefang.
You can filter and molest messages as much as you'd like, with simple perl 
code. It'd be a very general solution, as you require.

Here, from within mimedefang, every message that reaches DATA phase goes 
through clamav, bogofilter (single word list), and then SpamAssassin if 
bogofilter didn't give a certain enough score.  Plus there's a bunch of 
other weird custom filtering going on. :)
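
The staged flow described above might be sketched like this in Python (the real thing is Perl inside mimedefang). The three scanner callables, the cutoffs, and the result labels are all hypothetical stand-ins.

```python
def classify(message, clam_scan, bogo_score, sa_score,
             spam_cutoff=0.9, ham_cutoff=0.1, sa_required=5.0):
    """Run the cheap checks first; call SpamAssassin only when unsure."""
    if clam_scan(message):            # stage 1: virus/signature scan
        return "virus"
    score = bogo_score(message)       # stage 2: Bayesian score in [0, 1]
    if score >= spam_cutoff:
        return "spam"
    if score <= ham_cutoff:
        return "ham"
    # stage 3: bogofilter was not certain enough, so ask SpamAssassin
    return "spam" if sa_score(message) >= sa_required else "ham"
```

The point of the ordering is that the most expensive scanner sees only the messages the cheaper ones could not decide.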

I don't know how high you need to scale things, but the above works fine on
a smallish to medium scale (~200k messages a day). Adding more BF wordlists
probably wouldn't be a problem, given enough memory.




Re: system response message backlash from spam messages

2009-02-12 Thread Matt Garretson
Ned Slider wrote:
 how much faith 
 do you place in a mail admin deploying SPF _AND_ bouncing messages on 
 SPF failure when they can't even address the issue that their servers 
 are responsible for the backscatter problem 


I think that you may be assuming too much about the way other people
run their mail servers. In any case, the backscatter received at my
site dropped off substantially once we published SPF records. My
faith, or lack of it, doesn't seem to matter too much.

Just another anecdotal data point in the graph of life...

-Matt


Re: Regular expression help

2009-01-21 Thread Matt Garretson
John Hardin wrote:
  On Wed, 21 Jan 2009, rje...@vzw.blackberry.net wrote:
 Didn't we already do this?


Hopefully it's just an old message that was stuck 
in a blackberry queue somewhere.  :)

 


Re: Temporary 'Replacements' for SaneSecurity

2009-01-14 Thread Matt Garretson
Is there any way that a more distributed method of delivering
updates could be more resistant to DDOS attacks?  E.g.
trackerless bittorrents (DHT), or something along those lines?

Just wondering in general


Re: Rule to catch PO#

2008-12-04 Thread Matt Garretson
This thread is getting ridiculous.  Just use

Subject =~ /po.*\d+/i

To avoid losing millions of dollars, surely they can put
up with a couple of porn and impotence spams.   :-)


Re: Rule to catch PO#

2008-12-02 Thread Matt Garretson
Ray Jette wrote:
 PO random #s
 POrandom #s
 PO# random #s
 PO#random #s
 PO # random #s
 PO #random #s


Try:

  Subject =~ /PO ?\#? ?\d+/i

If you don't need case insensitivity, remove the trailing 'i'.
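
A quick way to check the pattern against the six formats listed, using Python's re module. The digits stand in for the "random #s", and the backslash before # is dropped because Python does not need it outside verbose mode.

```python
import re

# Same pattern as above: "PO", optional space, optional "#", optional space, digits.
PO_RE = re.compile(r"PO ?#? ?\d+", re.IGNORECASE)

SAMPLES = [
    "PO 12345",
    "PO12345",
    "PO# 12345",
    "PO#12345",
    "PO # 12345",
    "PO #12345",
]

matches = {s: bool(PO_RE.search(s)) for s in SAMPLES}

if __name__ == "__main__":
    for subject, hit in matches.items():
        print(f"{subject!r}: {hit}")
```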




Re: UPS / FedEx spam with virus attached

2008-08-20 Thread Matt Garretson
Bob Pierce wrote:
 Of course the zip attachment contains a virus, and ClamAV does not seem
 to be catching that either.


At my site, ClamAV has been catching them as Email.Trojan.GZC for 
some time.  You might want to check your ClamAV patterns and/or config.

For newer ones that Clam doesn't yet catch, MIMEdefang might be an option
if you use sendmail.  filter_bad_filename() is the applicable function.

-Matt


Re: VBounce ruleset

2008-05-14 Thread Matt Garretson
Aaron Bennett wrote:
 production environment -- do you see them working with the default 
 scores, or have you tweaked them at all? 


I've set up a meta rule which adds more to the score if either
ANY_BOUNCE_MESSAGE or VBOUNCE_MESSAGE hit.  I also have custom
rules that try to decrease the score if it looks like a legit
bounce, based on local criteria.  This is on top of the checking
VBounce does.  Presently this setup catches a good portion of 
the backscatter, without appreciable FPs.



clear_headers does not remove X-Spam-Report

2004-09-24 Thread Matt Garretson
With SA 3.0, using clear_headers in local.cf does not prevent the
X-Spam-Report: header from being inserted into spam messages.  Is this
a bug or a feature?   Below is my local.cf.

### +++
required_score 8.0
clear_headers
report_safe 0
use_dcc 0
use_pyzor 0
use_razor2 0
dns_available yes
use_bayes 0
lock_method flock
fold_headers 0
envelope_sender_header Return-Path
use_auto_whitelist 0
### ---

Thanks,
-Matt