On Sat, 6 Sep 2003, Eric Murray wrote:

> On Fri, Sep 05, 2003 at 09:01:51AM -0700, Major Variola (ret) wrote:
>
> > Can we assume that the spam is generated by regexp-type programs?
> >
> > If so, are there good methods for inferring the regexp from examples,
> > and using this to infer spamfiltering rules?
> >
> > Good project for a machine learning type.
>
> My unscientific observations
> are that there's at least 6 or 8 different formats.
>
> Some are pretty long, i.e.:
>
> >Subject: RE: your medications                             fygbzdwvyyjqvvpnj  uyaecf 
> >ixoimctgdtrn kwqs mxatjr
>
> (that one could be encrypted text)
>
> others are short or have only numbers.
>
> My favorite spam-obfuscation technique is where they break up key words
> with HTML comments, i.e. pen<!--Mary had a little la-->is.
> (that won't show if you are using a mail reader that
> interprets HTML... read the source).
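
A filter can undo that trick by stripping HTML comments before it looks
for key words. What follows is only a minimal sketch in Python, not
anything a particular filter is known to do. The function names and the
keyword list are hypothetical, and the sample string is modelled on the
example quoted above.

```python
import re

# Matches HTML comments, including ones that span several lines.
HTML_COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)

def strip_html_comments(text):
    """Remove HTML comments so that split-up words rejoin."""
    return HTML_COMMENT.sub("", text)

def contains_keyword(text, keywords):
    """Case-insensitive keyword check done after comment stripping.

    `keywords` is a hypothetical list supplied by whoever runs the filter.
    """
    cleaned = strip_html_comments(text).lower()
    return any(k.lower() in cleaned for k in keywords)

sample = "your medi<!--Mary had a little la-->cations"
print(strip_html_comments(sample))  # prints: your medications
```

Matching on the cleaned text means the rule sees what an HTML-rendering
mail reader would show, not the raw source.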

There are many patterns to these emails.

We've got the 'legitimate' spam, and then there is the spam that gets sent
to the list by members who subscribe the list to the spammers.

Then there are emails that were spam sitting in people's inboxes and get
retransmitted by viruses, worms, and trojans. They may have started out as
spam but they've been hijacked for more nefarious purposes. Usually these
have lots of garbled text in them.
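
One crude way to flag that kind of garbled text is to look for words
with implausibly long runs of consonants. The sketch below is an
assumption on my part, not a tested filter rule: the threshold of 5 and
the function names are made up for illustration, and 'y' is treated as
a vowel.

```python
import re

# Runs of consonants; 'y' is deliberately left out (treated as a vowel).
CONSONANT_RUN = re.compile(r"[bcdfghjklmnpqrstvwxz]+")
WORD = re.compile(r"[A-Za-z]+")

def longest_consonant_run(token):
    """Length of the longest unbroken consonant run in a word."""
    runs = CONSONANT_RUN.findall(token.lower())
    return max((len(r) for r in runs), default=0)

def looks_garbled(token, max_run=5):
    """Heuristic: flag a word whose longest consonant run exceeds max_run.

    max_run=5 is an arbitrary illustrative threshold.
    """
    return longest_consonant_run(token) > max_run

def garbled_fraction(text):
    """Fraction of alphabetic words in the text that look garbled."""
    tokens = WORD.findall(text)
    if not tokens:
        return 0.0
    return sum(looks_garbled(t) for t in tokens) / len(tokens)

subject = "RE: your medications fygbzdwvyyjqvvpnj uyaecf ixoimctgdtrn"
print(garbled_fraction(subject))
```

Short junk like "kwqs" slips past a rule this simple, and a few real
English words would trip it, so it could only ever be one signal among
several.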

Then there are emails like the previous one, which are just 'snow' to
blind the users.

Another is non-English text. We've been seeing a lot more of these over
the last six months or so.

We've also been seeing lists of words being sent to email addresses. The
purpose is to dictionary attack the various security passwords on the
list.

It wouldn't surprise me one bit, considering the human mind, if a lot of
the spam we get in fact comes from non-spammers themselves. Priming the
pump, so to speak.


 -- --
      [EMAIL PROTECTED]                            [EMAIL PROTECTED]
      www.ssz.com                               www.open-forge.com

