How about 4 different super tests?  I fail messages automatically on =?ISO-8859-1?B?, and that accounts for more than 1% of the E-mail coming in to my server, but it adds only a handful of catches among what was being missed...no false positives.

I think I've mentioned enough times the other tests that I would like to have: a BODYTEXT filter that searches just the decoded non-HTML body, a NOTEXT test for nothing but spaces, returns and attachments (that's a key) after decoding and de-HTMLifying, and a TEXTCOUNT marquee test that would let you score on the amount of non-HTML decoded body text, just like SUBJECTSPACES and BCC but in reverse (the less there is, the higher the score).

I could catch so much crap with those 40 or so two-character gibberish strings; in fact I think they were properly tagging around 10% to 20% of all unique incoming messages today, if not more.  The gibberish subject filter is tagging over 5% by itself, and with perfect accuracy so far.  A functional gibberish body filter, though, would have a reasonable number of false positives (it was tagging buy.com links that were shown in displayable text, for instance).  Of course, I don't expect Scott to rush to my aid here.
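
To make the NOTEXT and TEXTCOUNT ideas concrete, here is a rough Python sketch of what such tests might compute. It is not Declude configuration, and nothing here reflects how Declude works internally; the function names, the score steps and the thresholds are all made up for illustration. It decodes each text part, strips the HTML, and scores higher the less displayable text remains.

```python
import re
from email import message_from_string
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Keeps character data; tags and comments are dropped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self.skip_depth = 0  # non-zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        if not self.skip_depth:
            self.chunks.append(data)


def visible_text(raw_message):
    """Decoded, de-HTMLified body text with whitespace collapsed."""
    pieces = []
    for part in message_from_string(raw_message).walk():
        if part.get_content_maintype() != "text":
            continue  # skips multipart containers and binary attachments
        if part.get_content_disposition() == "attachment":
            continue  # attachments do not count as body text
        payload = part.get_payload(decode=True)  # undoes base64 / quoted-printable
        if payload is None:
            continue
        try:
            text = payload.decode(part.get_content_charset() or "latin-1", "replace")
        except LookupError:  # unknown charset label
            text = payload.decode("latin-1")
        if part.get_content_subtype() == "html":
            parser = TextOnly()
            parser.feed(text)
            parser.close()
            text = " ".join(parser.chunks)
        pieces.append(text)
    return re.sub(r"\s+", " ", " ".join(pieces)).strip()


def no_text(raw_message):
    """Hypothetical NOTEXT: true when nothing displayable is left."""
    return visible_text(raw_message) == ""


def text_count_score(raw_message, steps=((0, 10), (20, 6), (50, 3))):
    """Hypothetical TEXTCOUNT: the less text, the higher the score.

    `steps` holds (max characters, points) pairs; the values are arbitrary.
    """
    n = len(visible_text(raw_message))
    for limit, points in steps:
        if n <= limit:
            return points
    return 0
```

A message whose HTML body holds only a linked image would come out of visible_text() empty, so no_text() would fire and text_count_score() would return the top step.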

I have managed, though, to add tests for SUBJECTSPACES (very effective), COMMENTS (effective) and BCC (just ok), along with some small keyword/phrase filters for the body, subject and sender, with very good success.  I only saw about 5 definitive false positives today out of around 3000 unique messages, but approximately 150 pieces of spam got through.  I think that could be reduced by as much as half without a measurable impact on the false positives.  If that doesn't work, I'm buying a gun :)

BTW, on Linux, my guru buddy recommends Postfix as the SMTP client and Webmin as the interface.  I don't dispute Sandy's faith in MS SMTP, though, and it can be run on the same box as IMail.

Matt




Dan Patnode wrote:
FYI, I pulled this test 3 weeks ago after an email from France came through (or rather didn't) with this subject:

Subject: =?ISO-8859-1?B?RW5qb3kgc3VtbWVyIHVudGlsIGl0cyB2ZXJ5IGVuZCE=?=

There definitely is a correlation here among spammers, ?B? encoded subjects, disposable domain names, and nothing else in the body of the message.  There has to be a way to bring the 2 or 3 variables together as a super test.
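
One way to picture such a super test, as a rough Python sketch rather than anything Declude offers: score nothing unless at least two of the signals fire on the same message. The signal names, the weights and the min_hits default are all hypothetical, and how each signal gets detected is left out.

```python
# Hypothetical weights for the three signals named above.
WEIGHTS = {
    "b_encoded_subject": 4,
    "disposable_domain": 3,
    "no_body_text": 5,
}


def super_test(signals, weights=WEIGHTS, min_hits=2):
    """Return a combined score only when enough signals fire together.

    `signals` maps a signal name to True/False for one message.
    A single signal on its own scores nothing.
    """
    hits = [name for name, fired in signals.items() if fired]
    if len(hits) < min_hits:
        return 0
    return sum(weights[name] for name in hits)
```

Under this scheme a legitimate message that trips only the ?B? subject signal would score 0, while a message that also has an empty body would pick up both weights.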


Dan


On Monday, September 8, 2003 19:05, Matthew Bramble <[EMAIL PROTECTED]> wrote:
  
Use a text filter and add something like:

SUBJECT 40 CONTAINS =?ISO-8859-1?b?

to it.

I tried this all the way down to just ?b? and a SUBJECT filter
didn't catch it.  The SUBJECT filter also doesn't catch the
decoded text.

I found though that if you use the HEADERS filter, it will
catch this (customize to suit, this will only catch Latin-1
that is base64 encoded, and I can't think of why that would be
necessary, I would think that only other character sets could
need this):

    HEADERS        10    CONTAINS    ISO-8859-1?B?

Neither the HEADERS filter nor the SUBJECT filter is catching
the decoded form of the text.  The BASE64 test is also not
catching this if it's only in the Subject of the message (I
assume it only does the body/attachments).
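
For anyone who wants to check what one of these subjects actually says, here is a small Python sketch using the standard email.header module. It only illustrates the raw versus decoded distinction; it says nothing about how the Declude filters work, and the helper names are made up.

```python
from email.header import decode_header, make_header


def decoded_subject(raw_subject):
    """Turn an RFC 2047 encoded Subject value into readable text."""
    return str(make_header(decode_header(raw_subject)))


def is_b_encoded_latin1(raw_subject):
    """True when the raw header carries a base64 (?B?) Latin-1 encoded word."""
    return "=?iso-8859-1?b?" in raw_subject.lower()


def subject_contains(raw_subject, needle):
    """Case-insensitive match against both the raw and the decoded form."""
    needle = needle.lower()
    return (needle in raw_subject.lower()
            or needle in decoded_subject(raw_subject).lower())


raw = "=?ISO-8859-1?B?RW5qb3kgc3VtbWVyIHVudGlsIGl0cyB2ZXJ5IGVuZCE=?="
print(decoded_subject(raw))  # Enjoy summer until its very end!
```

The sample value is the subject of the French message quoted earlier in this thread. A filter that sees only the raw header has to match the encoded marker itself, while one that sees the decoded form can match ordinary words.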

The not so funny thing is that I'm getting this now as a part
of those E-mails containing no displayable text.  This guy is
real good at getting through my settings unless he chooses a
bad IP to send from.  I think a few days ago, another person on
this list commented about this same spammer, bringing up the
domains that he is using (common words followed by numbers). 
The only pattern this guy leaves, apart from having no text in
the body, is having different countries' TLDs listed in the
Received line, the sender, and the reverse DNS.  Here's a copy
of what I just received using this technique (with links
modified):


    
>From - Mon Sep 08 17:36:44 2003
  
X-UIDL: 314612976
X-Mozilla-Status: 0011
X-Mozilla-Status2: 00000000
Received: from gjr.paknet.com.pk [81.128.130.33] by igaia.com with ESMTP
 (SMTPD32-7.13) id A6244F101D8; Mon, 08 Sep 2003 17:35:32 -0400
Date: Mon, 08 Sep 2003 21:35:35 +0000
Message-ID: <[EMAIL PROTECTED]>
X-Mailer: Windows Eudora Pro Version 2.2 (32)
To: [EMAIL PROTECTED]
Subject:
=?ISO-8859-1?B?UmU6T3JkZXIgU2lsZGVuYWZpbCBDaXRyYXRlICBmcm9tIGhvbWUgLSBubyBkb2N0b3IgcmVxdWlyZWQu?=
MIME-Version: 1.0
From: "Shirley Dalton" <[EMAIL PROTECTED]>
Content-Type: text/html
Content-Transfer-Encoding: 8bit
X-Declude-Sender: [EMAIL PROTECTED] [81.128.130.33]
X-Declude-Spoolname: Df62404f101d89e2c.SMD
X-Note: This E-mail was scanned by iGaia Incorporated's E-mail
service (www.igaia.com) for spam.
X-Note: This E-mail was sent from
host81-128-130-33.in-addr.btopenworld.com ([81.128.130.33]).
X-Spam-Tests-Failed: DSN, IPNOTINMX, NOLEGITCONTENT [1]
X-RCPT-TO: <[EMAIL PROTECTED]>
Status: U
X-UIDL: 314612976

<html><body>
<center><!--lfoln42j66--><a
href="http://www-dot-payment33dd-dot-com/host/default.asp?ID=omni"><img
src="http://discountrate2-dot-com/pics/gv1.gif" height="270" width="405"></a></center>
</html></body>
    
