It also won't catch things like: "Your Amazon.com order has shipped (#101-4385494-1223513)" or "ORDER NO.B1093613-RFGDEF-01 HAS BEEN SHIPPED OUT"

The last thing I want to do is FP (false-positive) on e-commerce mail. Some of these messages also contain very long strings made up entirely of consonants.

I'm not saying it would be a bad test because it would miss certain things; I'm saying it would be an unreliable test because of what it might catch unintentionally. Non-gibberish strings like the one in your example are also fairly uncommon. There are other obfuscation techniques, though, that use punctuation to separate letters, and these would probably be fairly safe to target as long as you only counted the punctuation marks when they are surrounded on both sides by letters. Something like the following:

E.N.L.A^R.G.E
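
As a rough illustration of that counting rule (not Declude filter syntax, just a Python sketch), the snippet below counts punctuation marks that have a letter immediately on both sides. The function name is hypothetical, and treating "punctuation" as anything that is neither a word character nor whitespace is an assumption.

```python
import re

# Assumption: "punctuation" means any character that is neither a word
# character (letter, digit, underscore) nor whitespace. The lookbehind and
# lookahead require an ASCII letter directly on each side without consuming it.
_FLANKED_PUNCT = re.compile(r"(?<=[A-Za-z])[^\w\s](?=[A-Za-z])")


def count_letter_flanked_punctuation(subject: str) -> int:
    """Hypothetical helper: count punctuation marks sitting between two letters."""
    return len(_FLANKED_PUNCT.findall(subject))


if __name__ == "__main__":
    for subject in (
        "E.N.L.A^R.G.E",
        "Your Amazon.com order has shipped (#101-4385494-1223513)",
    ):
        print(count_letter_flanked_punctuation(subject), subject)
```

The hyphens in an order number sit between digits, so they are not counted, but a domain name such as Amazon.com still contributes one hit. Any real filter along these lines would presumably need a threshold above one or two.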

This would be a derivative of the COMMENTS test, applied to the subject. The only issue is that this stuff is already easy to target with a bunch of other filters, so it almost never avoids deletion on my system. I'm watching this one, though, because it could become much worse. With the new functionality it's also possible to write a filter for this, although it's a bit kludgey.

Matt



John Tolmachoff (Lists) wrote:

GIBBERISHSUB would not catch things like BestProductEver and
ImportantPleaseReadNow and so forth.

I have seen a number of spam messages where the words are run together
without spaces to bypass filters. Being able to count consecutive characters
and add a weight of, say, no more than 5 would help.

John Tolmachoff
Engineer/Consultant/Owner
eServices For You




-----Original Message-----
From: [EMAIL PROTECTED] [mailto:Declude.JunkMail-
[EMAIL PROTECTED] On Behalf Of Matthew Bramble
Sent: Friday, January 02, 2004 9:14 AM
To: [EMAIL PROTECTED]
Subject: Re: [Declude.JunkMail] CONSECUTIVECHAR test!

John,

This would FP on messages that include IDs in the subject, such as
receipts, and also on base64-encoded subjects, some of which are perfectly
valid; Declude doesn't decode subjects at this time.  I also tend to
see more characters in receipts than I see in spam that appends
gibberish.

I don't think this could be made reliable without a good deal of error
detection.  GIBBERISHSUB actually would be a lot more reliable.

Matt




John Tolmachoff (Lists) wrote:




Test suggestion.

This would be like SUBJECTSPACES, but instead would count consecutive
characters other than spaces in the subject line.

CONSECUTIVECHAR consecutivechar 20 x 5 0
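
To make the suggested measurement concrete, here is a small Python sketch (not Declude configuration) of "longest run of consecutive non-space characters" in a subject. The function name is hypothetical, and reading the 20 in the suggested test line as the run-length threshold is an assumption.

```python
def longest_nonspace_run(subject: str) -> int:
    """Hypothetical helper: length of the longest whitespace-free run in a subject."""
    # str.split() with no argument splits on any run of whitespace,
    # so each token is one unbroken run of non-space characters.
    return max((len(token) for token in subject.split()), default=0)


# Assumption: the 20 in the suggested test line is the trigger threshold.
THRESHOLD = 20

if __name__ == "__main__":
    for subject in (
        "BestProductEver",
        "ImportantPleaseReadNow",
        "Your Amazon.com order has shipped (#101-4385494-1223513)",
    ):
        run = longest_nonspace_run(subject)
        print(run, run >= THRESHOLD, subject)
```

Under that assumed threshold, the Amazon receipt subject quoted elsewhere in this thread trips the test while BestProductEver does not, which is the false-positive concern raised in the replies.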

John Tolmachoff
Engineer/Consultant/Owner
eServices For You





