Not bad. Makes me wonder if the future test grouping feature would be even stronger with exclusive as well as inclusive grouping. Must have (1) and (2) but not (3).
That would rock! :)

Dan


On Thursday, September 11, 2003 15:05, Matthew Bramble <[EMAIL PROTECTED]> wrote:
>Dan,
>
>There's a decent way around that. You can set the test in the Config
>file for a solid weight, not score each filter test incrementally, and
>then provide a list of negative tests that would offset the test. So if
>there is some sort of ISO tagging of this Japanese stuff, you can find
>that code and defeat the test from running. Same goes for
>other languages.
>
>I just got my first false positive out of 200 catches. This was from
>Korea but written in English (still encoded though). There are two
>clues in the headers as to how to defeat the test:
>
>Subject: [22] =?euc-kr?B?R2VuZXJhbCBJbnF1aXJ5IGZvciBzbm93bW9iaWxl?=
>Content-Type: text/html; charset=euc-kr
>
>You could probably do something like the following (suggested
>replacement for the original filter if you are using it):
>
>
>GIBBERISHSUB filter C:\IMail\Declude\Filters\GibberishSub.txt x 5 0
>
># The following defeats the test if it finds the subject is not sent as ASCII
>
>SUBJECT -5 CONTAINS ?b?
>
># Small list of letter combinations not found in a basic dictionary.
>
>SUBJECT 0 CONTAINS qb
>SUBJECT 0 CONTAINS qc
>SUBJECT 0 CONTAINS qd
>SUBJECT 0 CONTAINS qe
>SUBJECT 0 CONTAINS qf
>SUBJECT 0 CONTAINS qg
>SUBJECT 0 CONTAINS qh
>SUBJECT 0 CONTAINS qi
>SUBJECT 0 CONTAINS qj
>SUBJECT 0 CONTAINS qk
>SUBJECT 0 CONTAINS qm
>SUBJECT 0 CONTAINS qn
>SUBJECT 0 CONTAINS qo
>SUBJECT 0 CONTAINS qp
>SUBJECT 0 CONTAINS qr
>SUBJECT 0 CONTAINS qs
>SUBJECT 0 CONTAINS qt
>SUBJECT 0 CONTAINS qv
>SUBJECT 0 CONTAINS qx
>SUBJECT 0 CONTAINS qy
>SUBJECT 0 CONTAINS qz
>
>SUBJECT 0 CONTAINS vq
>SUBJECT 0 CONTAINS wq
>SUBJECT 0 CONTAINS tq
>SUBJECT 0 CONTAINS jq
>
>SUBJECT 0 CONTAINS xd
>SUBJECT 0 CONTAINS xj
>SUBJECT 0 CONTAINS xk
>SUBJECT 0 CONTAINS xr
>SUBJECT 0 CONTAINS xz
>
>SUBJECT 0 CONTAINS zb
>SUBJECT 0 CONTAINS zc
>SUBJECT 0 CONTAINS zf
>SUBJECT 0 CONTAINS zj
>SUBJECT 0 CONTAINS zk
>SUBJECT 0 CONTAINS zl
>SUBJECT 0 CONTAINS zm
>SUBJECT 0 CONTAINS zx
>
>
>Matt
>
>
>Dan Patnode wrote:
>
>>Follow-up,
>>
>>Used in a high weight soft test, 3 of the Q subject tests FPd this
>>morning. It seems that Japanese encoded messages like lots of mixed
>>up letters.
>>
>>More testing...
>>
>>Dan
>>
>>
>>On Wednesday, September 10, 2003 19:20, Dan Patnode <[EMAIL PROTECTED]> wrote:
>>
>>>I did a scan of all uncaught spam from the last week, found all
>>>the ones with Q, removed the QU's and ended up with this list.
>>>All of these would have been seen by Matt's new config:
>>>
>>>Subject: Block those unwanted Popups yqvqk
>>>Subject: drive luxury cars and get paid 9xP%oY5NzPG\q2G
>>>Subject: drive luxury cars and get paid L0z[7J4aYq!F7P1
>>>Subject: drive luxury cars and get paid 9xP%oY5NzPG\q2G
>>>Subject: drive luxury cars and get paid L0z[7J4aYq!F7P1
>>>Subject: FW: Block those unwanted Popups yqvqk
>>>Subject: FW: drive luxury cars and get paid 9xP%oY5NzPG\q2G
>>>Subject: FW: drive luxury cars and get paid L0z[7J4aYq!F7P1
>>>Subject: FW: get that extra boost in the bed uvqtc qqyixu
>>>Subject: FW: new mail REgnfqnKQT
>>>Subject: Fw: :( would u mind if i .. jqvmoiqfkzkokdwns u
>>>Subject: get that extra boost in the bed uvqtc qqyixu
>>>Subject: get that extra boost in the bed uvqtc qqyixu
>>>Subject: Re: new mail REgnfqnKQT
>>>Subject: Re: new mail REgnfqnKQT
>>>Subject: Stop messages SPAM po p vyoaejswayqo
>>>Subject: [Fwd:
>>>=?GB2312?B?0OnE4r/VvOS089PFu92jrDE5OdSqv8nS1L2o0ru49s341b6jrA==?==?GB2312?B?uM+/7LW9d3d3LjA3NTVzei5jb23J6sfrsMld?=
>>>
>>>
>>>Dan
>>>
>>>
>>>On Wednesday, September 10, 2003 17:45, Matthew Bramble <[EMAIL PROTECTED]> wrote:
>>>
>>>>How about 4 different super tests? I fail automatically on
>>>>=?ISO-8859-1?B?, and that accounts for more than 1% of the
>>>>E-mail coming in to my server, but only a handful of additional
>>>>catches in what was being missed...no false positives. I think
>>>>I've mentioned enough times, the other tests that I would like
>>>>to have...a BODYTEXT filter that searches just a decoded
>>>>non-HTML body, a NOTEXT test for nothing but spaces and returns
>>>>and attachments (that's a key) after decoding and
>>>>de-HTMLifying, and a TEXTCOUNT marquee test that would allow
>>>>you to search for amounts of non-HTML decoded body text
>>>>just like SUBJECTSPACES and BCC, but in reverse (the less there
>>>>is, the higher the score).
>>>>I could catch so much crap with
>>>>those 40 or so two character gibberish strings, in fact I think
>>>>it was properly tagging around 10% to 20% of all unique
>>>>incoming messages today if not more. That gibberish subject
>>>>filter is tagging over 5% by itself, and with perfect accuracy
>>>>so far. A functional gibberish body filter though would have a
>>>>reasonable number of false positives (was tagging buy.com links
>>>>that were shown in displayable text for instance). I don't of
>>>>course though expect Scott to rush to my aid here.
>>>>
>>>>I have managed to add though tests for SUBJECTSPACES (very
>>>>effective), COMMENTS (effective) and BCC (just ok), along with
>>>>some small key word/phrase filters for the body, subject and
>>>>sender with very good success. I only saw about 5 definitive
>>>>false positives today out of around 3000 unique messages, but
>>>>approximately 150 pieces of spam got through. I think that
>>>>could be reduced by as much as half without a measurable impact
>>>>on the false positives. If that doesn't work, I'm buying a gun
>>>>:)
>>>>
>>>>BTW, on Linux, my guru buddy recommends Postfix as the SMTP
>>>>client and Webmin as the interface. I don't though dispute
>>>>Sandy's faith in MS SMTP, and it can be run on the same box as
>>>>IMail.
>>>>
>>>>Matt
>>>>
>>>>
>>>>Dan Patnode wrote:
>>>>
>>>>FYI, I pulled this test 3 weeks ago after an email from France
>>>>came through (or rather didn't) with this subject:
>>>>
>>>>Subject:
>>>>=?ISO-8859-1?B?RW5qb3kgc3VtbWVyIHVudGlsIGl0cyB2ZXJ5IGVuZCE=?=
>>>>
>>>>There definitely is a correlation here among spammers, ?B?
>>>>encoded subjects, disposable domain names, and nothing else in
>>>>the body of the message. There has to be a way to bring the 2
>>>>or 3 variables together as a super test.
>>>>
>>>>
>>>>Dan
>>>>
>>>>
>>>>On Monday, September 8, 2003 19:05, Matthew Bramble <[EMAIL PROTECTED]> wrote:
>>>>
>>>>
>>>>Use a text filter and add something like:
>>>>
>>>>SUBJECT 40 CONTAINS =?ISO-8859-1?b?
>>>>
>>>>to it.
>>>>
>>>>I tried this all the way down to just ?b? and a SUBJECT filter
>>>>didn't catch it. The SUBJECT filter also doesn't catch the
>>>>decoded text.
>>>>
>>>>I found though that if you use the HEADERS filter, it will
>>>>catch this (customize to suit, this will only catch Latin-1
>>>>that is base64 encoded, and I can't think of why that would be
>>>>necessary, I would think that only other character sets could
>>>>need this):
>>>>
>>>> HEADERS 10 CONTAINS ISO-8859-1?B?
>>>>
>>>>Neither the HEADERS filter nor the SUBJECT filter is catching
>>>>the decoded form of the text. The BASE64 test is also not
>>>>catching this if it's only in the Subject of the message (I
>>>>assume it only does the body/attachments).
>>>>
>>>>The not so funny thing is that I'm getting this now as a part
>>>>of those E-mails containing no displayable text. This guy is
>>>>real good at getting through my settings unless he chooses a
>>>>bad IP to send from. I think a few days ago, another person on
>>>>this list commented about this same spammer, bringing up the
>>>>domains that he is using (common words followed by numbers).
>>>>The only pattern this guy leaves, apart from having no text in
>>>>the body, is having different countries' TLDs listed in the
>>>>Received line, the sender, and the reverse DNS.
>>>>Here's a copy
>>>>of what I just received using this technique (with links
>>>>modified):
>>>>
>>>>
>>>>>From - Mon Sep 08 17:36:44 2003
>>>>
>>>>X-UIDL: 314612976
>>>>X-Mozilla-Status: 0011
>>>>X-Mozilla-Status2: 00000000
>>>>Received: from gjr.paknet.com.pk [81.128.130.33] by igaia.com with ESMTP
>>>>(SMTPD32-7.13) id A6244F101D8; Mon, 08 Sep 2003 17:35:32 -0400
>>>>Date: Mon, 08 Sep 2003 21:35:35 +0000
>>>>Message-ID: <[EMAIL PROTECTED]>
>>>>X-Mailer: Windows Eudora Pro Version 2.2 (32)
>>>>To: [EMAIL PROTECTED]
>>>>Subject:
>>>>=?ISO-8859-1?B?UmU6T3JkZXIgU2lsZGVuYWZpbCBDaXRyYXRlICBmcm9tIGhvbWUgLSBubyBkb2N0b3IgcmVxdWlyZWQu?=
>>>>MIME-Version: 1.0
>>>>From: "Shirley Dalton" <[EMAIL PROTECTED]>
>>>>Content-Type: text/html
>>>>Content-Transfer-Encoding: 8bit
>>>>X-Declude-Sender: [EMAIL PROTECTED] [81.128.130.33]
>>>>X-Declude-Spoolname: Df62404f101d89e2c.SMD
>>>>X-Note: This E-mail was scanned by iGaia Incorporated's E-mail
>>>>service (www.igaia.com) for spam.
>>>>X-Note: This E-mail was sent from
>>>>host81-128-130-33.in-addr.btopenworld.com ([81.128.130.33]).
>>>>X-Spam-Tests-Failed: DSN, IPNOTINMX, NOLEGITCONTENT [1]
>>>>X-RCPT-TO: <[EMAIL PROTECTED]>
>>>>Status: U
>>>>X-UIDL: 314612976
>>>>
>>>><html><body>
>>>><center><!--lfoln42j66--><a
>>>>href="http://www-dot-payment33dd-dot-com/host/default.asp?ID=omni"><img
>>>>src="http://discountrate2-dot-com/pics/gv1.gif" height="270"
>>>>width="405"></a></center>
>>>></html></body>
>>>>
>
>---
>[This E-mail was scanned for viruses by Declude Virus
>(http://www.declude.com)]
>
>---
>This E-mail came from the Declude.JunkMail mailing list. To
>unsubscribe, just send an E-mail to [EMAIL PROTECTED], and
>type "unsubscribe Declude.JunkMail". The archives can be found
>at http://www.mail-archive.com.
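
For anyone who wants to experiment with the idea outside of Declude, here is a minimal sketch in Python of the approach discussed in this thread: one flat weight when a subject contains any of the listed letter pairs, cancelled when the subject is sent as a base64 encoded word ("?b?"). This is not Declude filter syntax and makes no claim about how Declude evaluates filters. The function names `decode_subject` and `gibberish_weight` and the default weight of 5 are assumptions made for illustration only. The decoding step uses Python's standard `email.header.decode_header`.

```python
import email.header

# Letter pairs copied from the GibberishSub.txt listing quoted in this thread.
GIBBERISH_PAIRS = (
    "qb qc qd qe qf qg qh qi qj qk qm qn qo qp qr qs qt qv qx qy qz "
    "vq wq tq jq "
    "xd xj xk xr xz "
    "zb zc zf zj zk zl zm zx"
).split()


def decode_subject(raw_subject):
    """Decode =?charset?B?...?= words; return (text, set of declared charsets)."""
    pieces = []
    charsets = set()
    for chunk, charset in email.header.decode_header(raw_subject):
        if isinstance(chunk, bytes):
            try:
                chunk = chunk.decode(charset or "ascii", errors="replace")
            except LookupError:
                # Charset name unknown to Python: fall back to a lossless byte mapping.
                chunk = chunk.decode("latin-1")
        if charset:
            charsets.add(charset.lower())
        pieces.append(chunk)
    return "".join(pieces), charsets


def gibberish_weight(raw_subject, weight=5):
    """Hypothetical scorer: flat weight on any listed pair, zero for ?b? subjects."""
    lowered = raw_subject.lower()
    if "?b?" in lowered:
        # Mirrors the offsetting "SUBJECT -5 CONTAINS ?b?" idea from the thread.
        return 0
    if any(pair in lowered for pair in GIBBERISH_PAIRS):
        return weight
    return 0


if __name__ == "__main__":
    korean = "[22] =?euc-kr?B?R2VuZXJhbCBJbnF1aXJ5IGZvciBzbm93bW9iaWxl?="
    print(decode_subject(korean))
    print(gibberish_weight(korean))                               # 0
    print(gibberish_weight("Block those unwanted Popups yqvqk"))  # 5
```

Note that the base64 text of the Korean subject happens to contain "xj" (in "ZXJh"), so without the "?b?" check the raw header would match the pair list even though the decoded subject is ordinary English. That may be one reason encoded subjects produced false positives.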
