On Mon, 24 Oct 2005, [EMAIL PROTECTED] whispered secretively:
> I'm not sure what the SA folks think about this now a days.  A while
> back, they removed the checks for MS executables as being spam
> indicators even though the test actually is a very good indicator of
> spam.

That's because it didn't work very well. The new AntiVirus plugin
does a much better job, but note that it is *not* an antivirus plugin
despite the name: it's a suspect-extension-and-content-type detector,
so if your users are in the habit of mailing executables or PowerPoint
documents or things of that nature around, the plugin will cause FPs.
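
To make the FP risk concrete, here is a minimal sketch of what a
suspect-extension-and-content-type check amounts to. It is Python, not
the plugin's real code, and the extension list, the content-type list,
the function name suspect_parts and the sample message are all made up
for illustration.

```python
from email.message import EmailMessage
import os

# Hypothetical lists for illustration; not the plugin's real configuration.
SUSPECT_EXTENSIONS = {".exe", ".scr", ".pif", ".com", ".bat", ".ppt"}
SUSPECT_TYPES = {"application/x-msdownload", "application/vnd.ms-powerpoint"}


def suspect_parts(msg):
    """Return (filename, content type) for each part that looks suspect.

    Only names and declared types are inspected; the payload is never
    scanned, so this cannot tell a worm from a legitimate attachment.
    """
    hits = []
    for part in msg.walk():
        filename = part.get_filename() or ""
        ext = os.path.splitext(filename)[1].lower()
        ctype = part.get_content_type()
        if ext in SUSPECT_EXTENSIONS or ctype in SUSPECT_TYPES:
            hits.append((filename, ctype))
    return hits


# A perfectly legitimate mail with slides attached still gets flagged.
msg = EmailMessage()
msg["Subject"] = "Slides for Thursday"
msg.set_content("Slides attached.")
msg.add_attachment(b"not really a presentation",
                   maintype="application",
                   subtype="vnd.ms-powerpoint",
                   filename="quarterly.ppt")
print(suspect_parts(msg))
```

Nothing in that check looks at what the attachment actually does, which
is why a site that routinely mails such files around will see FPs.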

>         Instead, SA is detecting email worms via the Bayesian analysis,
> detecting keywords that match MS executables, even though it doesn't
> do anywhere near as good a job.

That's because there aren't many such keywords.

> Email worms are one of the most dangerous and destructive forms of
> UBE.  They directly lead to open proxies that are used for "regular"
> spam.  IMHO, they should be paid *more* attention to than "regular"
> spam, not less.

The problem is that the properties of worms are totally different to the
properties of spam. Spam is wildly variable but intended to contain
components that are read by human beings, and the vast majority of
SpamAssassin's rules look for things on that basis. Worms are vast lumps
of mostly-invariant binary data: the regex rules, the URIBL system, and
the Bayesian analyzer are mostly useless on them, and that doesn't
really leave very much bar header analysis (and half of those rules are
useless on worms too). SA has *no* facilities for spotting patterns in
big lumps of binary data, let alone the automated partial disassembly,
static behavioural analysis routines, UPX and OLE unpackers and so on
that many virus scanners have. There is almost no
overlap between the jobs they have to do, or between the nature of the
emails they trap.

Plus, even with the sa-update system, worms change so fast that, with
SA's regex matching and URIBL rendered useless by the binary-lump nature
of worms, SA would never spot most new worms. (The only reason it spots
most spam is that rules that caught old spam often catch new spam
too.  Rules meant to catch old worms pretty much *never* catch new ones
unless, like the MICROSOFT_EXECUTABLE rule, they're so general that they
could easily catch lots of stuff that isn't wormy as well.)
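
As an illustration of how general such a rule has to be, here is a rough
Python sketch of a "does any attachment start with the DOS/Windows MZ
magic bytes" test. This is not how MICROSOFT_EXECUTABLE is actually
implemented; the function names and the two stand-in payloads are
invented for the example.

```python
from email.message import EmailMessage


def has_mz_attachment(msg):
    """True if any decoded non-multipart part starts with the 'MZ' magic."""
    for part in msg.walk():
        if part.is_multipart():
            continue
        payload = part.get_payload(decode=True) or b""
        if payload.startswith(b"MZ"):
            return True
    return False


def make_mail(filename, data):
    """Build a small test mail carrying `data` as an attachment."""
    msg = EmailMessage()
    msg.set_content("See attached.")
    msg.add_attachment(data, maintype="application",
                       subtype="octet-stream", filename=filename)
    return msg


# Stand-in payloads: both merely begin with the two-byte MZ header.
worm = make_mail("photo.scr", b"MZ" + b"\x90" * 64)
installer = make_mail("setup.exe", b"MZ" + b"\x00" * 64)

print(has_mz_attachment(worm), has_mz_attachment(installer))
```

Both mails trip the check, because the only thing it knows about is the
file format, not whether the contents are wormy.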

Plus, worms are often so large that scanning them with SA is
astonishingly inefficient. SA is many, many times slower than
dedicated tools like clamav and can never do as good a job as one of
them. SA would need *tens of thousands* of individually crafted
anti-worm rules to do as good a job as clamav --- and that's *orders of
magnitude* more rules than SA has right now. It'd become unimaginably
slow and immensely bloated, and would *still* do a bad job.


So even though they're UBE, executable lumps aren't something that SA
can efficiently spot. (Equally, though, sometimes antivirus tools like
clamav start attacking things that perhaps they shouldn't: clamav
catches some phishing scams, so those of us with corpuses have had to
stop it rejecting such mails lest it bias the corpuses, as SA *is*
intended to catch phish.)

-- 
`"Gun-wielding recluse gunned down by local police" isn't the epitaph
 I want. I am hoping for "Witnesses reported the sound up to two hundred
 kilometers away" or "Last body part finally located".' --- James Nicoll
