Dan,

There's a decent way around that. You can give the test one fixed weight in the Config file instead of scoring each filter line incrementally, and then add a list of negative-weight lines that offset the test. So if there is some sort of ISO tagging on this Japanese stuff, you can match on that code and cancel the test out. The same goes for other languages.

I just got my first false positive out of 200 catches. This was from Korea but written in English (still encoded though). There are two clues in the headers as to how to defeat the test:

Subject: [22] =?euc-kr?B?R2VuZXJhbCBJbnF1aXJ5IGZvciBzbm93bW9iaWxl?=
Content-Type: text/html; charset=euc-kr
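
For anyone who wants to see what is hiding inside that subject, here is a minimal sketch in Python (outside Declude entirely; the helper name decode_subject is made up for illustration). It uses the standard email.header module to decode the RFC 2047 encoded-word. The base64 payload above decodes to "General Inquiry for snowmobile", which is plain English even though it is tagged euc-kr.

```python
from email.header import decode_header

def decode_subject(raw):
    """Decode RFC 2047 encoded-words in a Subject value into plain text."""
    parts = []
    for text, charset in decode_header(raw):
        if isinstance(text, bytes):
            # Unencoded fragments come back with charset None; treat them as ASCII.
            text = text.decode(charset or "ascii", errors="replace")
        parts.append(text)
    return "".join(parts)

# Subject value copied from the headers quoted above.
raw = "[22] =?euc-kr?B?R2VuZXJhbCBJbnF1aXJ5IGZvciBzbm93bW9iaWxl?="
print(decode_subject(raw))
```

The gibberish pairs are matched against the raw, still-encoded text, so the base64 letters are what trip the test, not the readable words.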

You could probably do something like the following (suggested replacement for the original filter if you are using it):



GIBBERISHSUB filter C:\IMail\Declude\Filters\GibberishSub.txt x 5 0

# The following defeats the test if it finds the subject is not sent as ASCII

SUBJECT -5 CONTAINS ?b?

# Small list of letter combinations not found in a basic dictionary.

SUBJECT        0    CONTAINS    qb
SUBJECT        0    CONTAINS    qc
SUBJECT        0    CONTAINS    qd
SUBJECT        0    CONTAINS    qe
SUBJECT        0    CONTAINS    qf
SUBJECT        0    CONTAINS    qg
SUBJECT        0    CONTAINS    qh
SUBJECT        0    CONTAINS    qi
SUBJECT        0    CONTAINS    qj
SUBJECT        0    CONTAINS    qk
SUBJECT        0    CONTAINS    qm
SUBJECT        0    CONTAINS    qn
SUBJECT        0    CONTAINS    qo
SUBJECT        0    CONTAINS    qp
SUBJECT        0    CONTAINS    qr
SUBJECT        0    CONTAINS    qs
SUBJECT        0    CONTAINS    qt
SUBJECT        0    CONTAINS    qv
SUBJECT        0    CONTAINS    qx
SUBJECT        0    CONTAINS    qy
SUBJECT        0    CONTAINS    qz

SUBJECT        0    CONTAINS    vq
SUBJECT        0    CONTAINS    wq
SUBJECT        0    CONTAINS    tq
SUBJECT        0    CONTAINS    jq

SUBJECT        0    CONTAINS    xd
SUBJECT        0    CONTAINS    xj
SUBJECT        0    CONTAINS    xk
SUBJECT        0    CONTAINS    xr
SUBJECT        0    CONTAINS    xz

SUBJECT        0    CONTAINS    zb
SUBJECT        0    CONTAINS    zc
SUBJECT        0    CONTAINS    zf
SUBJECT        0    CONTAINS    zj
SUBJECT        0    CONTAINS    zk
SUBJECT        0    CONTAINS    zl
SUBJECT        0    CONTAINS    zm
SUBJECT        0    CONTAINS    zx
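
To sanity-check the idea away from the server, here is a rough Python model of the filter above. It is only a sketch under stated assumptions, not Declude's actual engine: it assumes the flat weight of 5 is applied once when any line matches, that per-line weights are added on top, and that CONTAINS ignores case. The names TEST_WEIGHT, RULES and score_subject are hypothetical.

```python
# Hypothetical model of the filter above; not Declude's real scoring code.
TEST_WEIGHT = 5  # flat weight from the config line (assumed to apply once)

PAIRS = ("qb qc qd qe qf qg qh qi qj qk qm qn qo qp qr qs qt qv qx qy qz "
         "vq wq tq jq xd xj xk xr xz zb zc zf zj zk zl zm zx").split()

# (weight, substring) rules mirroring the filter file lines.
RULES = [(-5, "?b?")] + [(0, pair) for pair in PAIRS]

def score_subject(subject):
    """Return (score, matched substrings) under the assumed scoring model."""
    lowered = subject.lower()  # assumption: CONTAINS is case-insensitive
    matched = [(weight, text) for weight, text in RULES if text in lowered]
    if not matched:
        return 0, []
    total = TEST_WEIGHT + sum(weight for weight, _ in matched)
    return total, [text for _, text in matched]

spam = "Block those unwanted Popups yqvqk"
korean = "[22] =?euc-kr?B?R2VuZXJhbCBJbnF1aXJ5IGZvciBzbm93bW9iaWxl?="

print(score_subject(spam))    # gibberish pairs only, keeps the flat weight
print(score_subject(korean))  # pairs found in the base64 text, offset by ?b?
```

Under this model the Korean subject still matches a few pairs inside its base64 text, but the -5 line brings the total back to zero.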



Matt







Dan Patnode wrote:

Follow-up,

Used in a high-weight soft test, 3 of the Q subject tests FP'd this morning. It seems that Japanese encoded messages look like lots of mixed-up letters.

More testing...

Dan



On Wednesday, September 10, 2003 19:20, Dan Patnode <[EMAIL PROTECTED]> wrote:


I did a scan of all uncaught spam from the last week, found all
the ones with Q, removed the QU's and ended up with this list.
All of these would have been seen by Matt's new config:


Subject: Block those unwanted Popups yqvqk
Subject: drive luxury cars and get paid 9xP%oY5NzPG\q2G
Subject: drive luxury cars and get paid L0z[7J4aYq!F7P1
Subject: drive luxury cars and get paid 9xP%oY5NzPG\q2G
Subject: drive luxury cars and get paid L0z[7J4aYq!F7P1
Subject: FW: Block those unwanted Popups yqvqk
Subject: FW: drive luxury cars and get paid 9xP%oY5NzPG\q2G
Subject: FW: drive luxury cars and get paid L0z[7J4aYq!F7P1
Subject: FW: get that extra boost in the bed uvqtc qqyixu
Subject: FW: new mail REgnfqnKQT
Subject: Fw: :( would u mind if i .. jqvmoiqfkzkokdwns u
Subject: get that extra boost in the bed uvqtc qqyixu
Subject: get that extra boost in the bed uvqtc qqyixu
Subject: Re: new mail REgnfqnKQT
Subject: Re: new mail REgnfqnKQT
Subject: Stop messages SPAM po p vyoaejswayqo
Subject: [Fwd:
=?GB2312?B?0OnE4r/VvOS089PFu92jrDE5OdSqv8nS1L2o0ru49s341b6jrA==?==?GB2312?B?uM+/7LW9d3d3LjA3NTVzei5jb23J6sfrsMld?=
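
A scan like this can be approximated with a short script. The following is a sketch of one reading of the method (keep subjects that contain a Q not followed by U), not the exact procedure used here; the function name and the third sample subject are invented for illustration.

```python
import re

# Match a "q" that is not immediately followed by "u", ignoring case.
Q_WITHOUT_U = re.compile(r"q(?!u)", re.IGNORECASE)

def subjects_with_odd_q(subjects):
    """Keep subjects containing at least one q that is not part of 'qu'."""
    return [s for s in subjects if Q_WITHOUT_U.search(s)]

sample = [
    "Block those unwanted Popups yqvqk",
    "Re: new mail REgnfqnKQT",
    "Quick question about the quote",  # invented legitimate subject
]
print(subjects_with_odd_q(sample))
```

Note that a word ending in q also matches, and so does the base64 text of an encoded subject, which is where the false positives come from.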



Dan





On Wednesday, September 10, 2003 17:45, Matthew Bramble <[EMAIL PROTECTED]> wrote:


How about 4 different super tests?  I fail automatically on
=?ISO-8859-1?B?, and that accounts for more than 1% of the
E-mail coming in to my server, but only a handful of additional
catches in what was being missed...no false positives.  I think
I've mentioned enough times, the other tests that I would like
to have...a BODYTEXT filter that searches just a decoded
non-HTML body, a NOTEXT test for nothing but spaces and returns
and attachments (that's a key) after decoding and
de-HTMLifying, and a TEXTCOUNT marquee test that would allow
you to search for amounts of non-HTML decoded body text
just like SUBJECTSPACES and BCC, but in reverse (the less there
is, the higher the score).  I could catch so much crap with
those 40 or so two character gibberish strings, in fact I think
it was properly tagging around 10% to 20% of all unique
incoming messages today if not more.  That gibberish subject
filter is tagging over 5% by itself, and with perfect accuracy
so far.  A functional gibberish body filter though would have a
reasonable number of false positives (was tagging buy.com links
that were shown in displayable text for instance).  Of course,
I don't expect Scott to rush to my aid here.
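
To make the TEXTCOUNT and NOTEXT ideas concrete, here is a minimal Python sketch. Neither test exists in Declude as far as this thread says, so everything here is hypothetical: the function names, the thresholds and the weights are invented, and the body is assumed to be already transfer-decoded HTML.

```python
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collect text nodes; tags and comments are not passed to handle_data."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def visible_text(html_body):
    """Return the body text with tags removed and whitespace collapsed."""
    parser = _TextExtractor()
    parser.feed(html_body)
    parser.close()
    return " ".join("".join(parser.chunks).split())

def textcount_score(html_body, thresholds=((0, 10), (20, 5), (50, 2))):
    """Hypothetical weights: the less visible text, the higher the score."""
    count = len(visible_text(html_body))
    for limit, weight in thresholds:
        if count <= limit:
            return weight
    return 0

# Image-only body in the style of the spam discussed in this thread.
body = ('<html><body>\n<center><!--lfoln42j66--><a href="http://example.invalid/">'
        '<img src="http://example.invalid/gv1.gif"></a></center>\n</html></body>')
print(textcount_score(body))
```

A count of zero is the NOTEXT case. This simple extractor still counts script and style contents as text, so a real test would need to skip those.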

I have, though, managed to add tests for SUBJECTSPACES (very
effective), COMMENTS (effective) and BCC (just ok), along with
some small key word/phrase filters for the body, subject and
sender with very good success.  I only saw about 5 definitive
false positives today out of around 3000 unique messages, but
approximately 150 pieces of spam got through.  I think that
could be reduced by as much as half without a measurable impact
on the false positives.  If that doesn't work, I'm buying a gun
:)

BTW, on Linux, my guru buddy recommends Postfix as the SMTP
client and Webmin as the interface.  I don't though dispute
Sandy's faith in MS SMTP, and it can be run on the same box as
IMail.

Matt




Dan Patnode wrote:


FYI, I pulled this test 3 weeks ago after an email from France
came through (or rather didn't) with this subject:

Subject:
=?ISO-8859-1?B?RW5qb3kgc3VtbWVyIHVudGlsIGl0cyB2ZXJ5IGVuZCE=?=

There definitely is a correlation here among spammers, ?B?
encoded subjects, disposable domain names, and nothing else in
the body of the message.  There has to be a way to bring the 2
or 3 variables together as a super test.
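
One way to picture such a super test is a rule that only scores when several of the signals show up together. This Python sketch is purely illustrative: the function name, the weight of 10 and the "at least two signals" threshold are assumptions, and detecting each signal is left to whatever tests already exist.

```python
def super_test_weight(encoded_subject, disposable_domain, empty_body,
                      weight=10, required=2):
    """Return `weight` only when at least `required` signals co-occur."""
    signals = (encoded_subject, disposable_domain, empty_body)
    hits = sum(bool(signal) for signal in signals)
    return weight if hits >= required else 0

# Encoded subject alone (like the message from France): no weight.
print(super_test_weight(True, False, False))
# Encoded subject, throwaway domain and an empty body together: full weight.
print(super_test_weight(True, True, True))
```

Requiring a combination keeps a legitimate encoded subject from failing on its own.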


Dan



On Monday, September 8, 2003 19:05, Matthew Bramble <[EMAIL PROTECTED]> wrote:



Use a text filter and add something like:


SUBJECT 40 CONTAINS =?ISO-8859-1?b?

to it.

I tried this all the way down to just ?b? and a SUBJECT filter
didn't catch it.  The SUBJECT filter also doesn't catch the
decoded text.

I found, though, that if you use the HEADERS filter, it will
catch this (customize to suit; this will only catch Latin-1
that is base64 encoded, and I can't think of why that would be
necessary, since I would think that only other character sets
would need this):

HEADERS 10 CONTAINS ISO-8859-1?B?

Neither the HEADERS filter nor the SUBJECT filter is catching
the decoded form of the text.  The BASE64 test is also not
catching this if it's only in the Subject of the message (I
assume it only does the body/attachments).
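
The difference between matching the raw header and matching the decoded text can be shown with a small Python sketch. This is not how Declude works internally; it only parses one RFC 2047 encoded-word with a regular expression and decodes base64 payloads. The names ENCODED_WORD and inspect_subject are hypothetical, and the subject is the one from the sample message in this thread.

```python
import base64
import re

# RFC 2047 encoded-word layout: =?charset?encoding?payload?=
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([bBqQ])\?([^?]*)\?=")

def inspect_subject(raw):
    """Return (charset, encoding, decoded text) for the first encoded-word.

    Returns None when there is no encoded-word. Q encoding is recognised
    but not decoded in this sketch, so its text comes back as None.
    """
    match = ENCODED_WORD.search(raw)
    if match is None:
        return None
    charset, encoding, payload = match.groups()
    if encoding.upper() != "B":
        return charset.lower(), encoding.upper(), None
    text = base64.b64decode(payload).decode(charset, errors="replace")
    return charset.lower(), "B", text

raw = ("=?ISO-8859-1?B?UmU6T3JkZXIgU2lsZGVuYWZpbCBDaXRyYXRlICBmcm9tIGhvbWUg"
       "LSBubyBkb2N0b3IgcmVxdWlyZWQu?=")
charset, encoding, text = inspect_subject(raw)

print("iso-8859-1?b?" in raw.lower())  # what a raw header rule can see
print("sildenafil" in raw.lower())     # keyword is hidden in the raw form
print("sildenafil" in text.lower())    # keyword appears only after decoding
```

A keyword filter that never sees the decoded form has nothing to match, which is why the charset and encoding tag is the only handle left.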

The not so funny thing is that I'm getting this now as a part
of those E-mails containing no displayable text. This guy is
real good at getting through my settings unless he chooses a
bad IP to send from. I think a few days ago, another person on
this list commented about this same spammer, bringing up the
domains that he is using (common words followed by numbers).
The only pattern this guy leaves, apart from having no text in
the body, is having different countries' TLDs listed in the
Received line, the sender, and the reverse DNS. Here's a copy
of what I just received using this technique (with links
modified):





From - Mon Sep 08 17:36:44 2003


X-UIDL: 314612976
X-Mozilla-Status: 0011
X-Mozilla-Status2: 00000000
Received: from gjr.paknet.com.pk [81.128.130.33] by igaia.com with ESMTP
(SMTPD32-7.13) id A6244F101D8; Mon, 08 Sep 2003 17:35:32 -0400
Date: Mon, 08 Sep 2003 21:35:35 +0000
Message-ID: <[EMAIL PROTECTED]>
X-Mailer: Windows Eudora Pro Version 2.2 (32)
To: [EMAIL PROTECTED]
Subject:
=?ISO-8859-1?B?UmU6T3JkZXIgU2lsZGVuYWZpbCBDaXRyYXRlICBmcm9tIGhvbWUgLSBubyBkb2N0b3IgcmVxdWlyZWQu?=
MIME-Version: 1.0
From: "Shirley Dalton" <[EMAIL PROTECTED]>
Content-Type: text/html
Content-Transfer-Encoding: 8bit
X-Declude-Sender: [EMAIL PROTECTED] [81.128.130.33]
X-Declude-Spoolname: Df62404f101d89e2c.SMD
X-Note: This E-mail was scanned by iGaia Incorporated's E-mail
service (www.igaia.com) for spam.
X-Note: This E-mail was sent from
host81-128-130-33.in-addr.btopenworld.com ([81.128.130.33]).
X-Spam-Tests-Failed: DSN, IPNOTINMX, NOLEGITCONTENT [1]
X-RCPT-TO: <[EMAIL PROTECTED]>
Status: U
X-UIDL: 314612976

<html><body>
<center><!--lfoln42j66--><a
href="http://www-dot-payment33dd-dot-com/host/default.asp?ID=omni"><img
src="http://discountrate2-dot-com/pics/gv1.gif" height="270" width="405"></a></center>
</html></body>




---
[This E-mail was scanned for viruses by Declude Virus (http://www.declude.com)]

---
This E-mail came from the Declude.JunkMail mailing list.  To
unsubscribe, just send an E-mail to [EMAIL PROTECTED], and
type "unsubscribe Declude.JunkMail".  The archives can be found
at http://www.mail-archive.com.
