I approach it this way with my external test:

1. I attempt to base64-decode the subject if it is encoded.
2. Test 1 = Substitute foreign letters (accented vowels and such) with their English counterparts. Then all non A-Z are removed.
3. Test 2 = Foreign substitution, then @ to A, 0 to O, 5 and $ to S, ! | and 1 to I, \/ to V, 3 to E. Then all non A-Z are removed.
4. Test 3 = Same as Test 2, except that ! | and 1 go to L instead of I. Then all non A-Z are removed.
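For anyone who wants to experiment with this outside Visual Basic, the passes above can be sketched roughly as follows in Python. This is only an illustration under stated assumptions, not the original program: the names decode_subject, fold_accents and normalize are hypothetical, and dropping Unicode combining marks is just one assumed way of doing the foreign-letter substitution.

```python
import re
import unicodedata
from email.header import decode_header

# Single-character look-alikes shared by Test 2 and Test 3.
COMMON = {"@": "A", "0": "O", "5": "S", "$": "S", "3": "E"}


def decode_subject(raw):
    """Step 1: decode RFC 2047 encoded words (base64 or quoted-printable)."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            try:
                chunk = chunk.decode(charset or "ascii", errors="replace")
            except LookupError:  # unknown charset label
                chunk = chunk.decode("latin-1")
        parts.append(chunk)
    return "".join(parts)


def fold_accents(text):
    """Assumed 'foreign substitute': drop accents, keep the base letter."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def normalize(subject, test):
    """Return the A-Z-only form of a subject for Test 1, 2, or 3."""
    text = fold_accents(decode_subject(subject)).upper()
    if test in (2, 3):
        # The two-character \/ must be handled before single characters.
        text = text.replace("\\/", "V")
        stroke = "I" if test == 2 else "L"  # the only Test 2 / Test 3 difference
        table = {**COMMON, "!": stroke, "|": stroke, "1": stroke}
        text = text.translate(str.maketrans(table))
    # Finally remove everything that is not A-Z.
    return re.sub(r"[^A-Z]", "", text)
```

With this sketch, normalize("d*egree", 1) gives DEGREE, and normalize("v1agra", 2) gives VIAGRA while normalize("v1agra", 3) gives VLAGRA, which is why the I and L variants are run as separate tests.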
You can then test the results against a filter file containing the standard subject words. I have tests for ALTSUBJECTONLY, which fail a message if a filter word shows up in Test 1, Test 2, or Test 3 but was not in the original subject. So degree is OK, but d*egree fails.

That said, it really doesn't improve my spam detection. The items it catches are usually already scoring high enough that the extra points aren't needed. It was mostly a Visual Basic challenge for me to see if I could do it.

----- Original Message -----
From: "Darin Cox" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, October 07, 2004 2:33 PM
Subject: Re: [Declude.JunkMail] Filter File - Maximum Size?

> One of the most common misspellings I see is v1agra. According to your
> logic, this wouldn't get caught, would it?
>
> Perhaps amend the test to do some standard replacements of numbers with
> letters? For example,
>
> 0 -> o
> 1 -> i
> 3 -> e
> 5 -> s
> 8 -> a
>
> Darin.
>
>
> ----- Original Message -----
> From: "Paul Fuhrmeister" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Thursday, October 07, 2004 3:17 PM
> Subject: RE: [Declude.JunkMail] Filter File - Maximum Size?
>
>
> We wrote an external program that
>
> 1. Works with Declude as an external filter,
> 2. reads the email and picks out the subject line,
> 3. reads a very short list of words from a text file,
> 4. looks for the words in the subject line, then
> 5. strips all of the non-alpha characters out of the subject line (including
> numbers and spaces),
> 6. looks for the words in the subject line AGAIN,
> 7. returns a DOS error number ONLY if a banned word appears AFTER stripping
> out the non-alpha characters, and
> 8. keeps a log file identifying each message that failed and why.
>
> It only leaves about 5 ways to spell viagra. The after but not before test
> avoids false positives. We weight it 20 on our 20 point scale, but we're not
> aggressive with our word list.
>
> You have to be careful with your word list because we strip the spaces, some
> words are contained in other words, etc.
>
> I guess you could change it up and check the first 250 characters of the
> message body or something, but it doesn't deal with html.
>
> I can post the source code if anyone's interested (it's Visual Basic compiled
> to an exe).
>
> Paul Fuhrmeister
> [EMAIL PROTECTED]
>
> ________________________________
>
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Aaron Moreau-Cook
> Sent: Tuesday, October 05, 2004 6:51 PM
> To: [EMAIL PROTECTED]
> Subject: RE: [Declude.JunkMail] Filter File - Maximum Size?
>
>
> Thanks for the response. We are already using Sniffer; if a message triggers
> Sniffer we give the e-mail 60% of our delete weight. This works great, trust
> me... but I'm sick and tired of seeing w^^o_r-d#s l-!+k^e this in my hold
> queue.
>
> The problem is, how many ways can you spell a word? How many ^,*,$,#, and
> other characters can you put into a word to slip by Sniffer? Apparently
> there are 360,000 ways to spell Viagra by inserting these characters (and
> others) and changing certain letters to numbers.
>
> I'm frustrated by spammers, as I know we all are, so I'm just trying to find
> out if this is *even* a viable way to help Declude stop spam.
>
> Thanks
>
>
> ---
> [This E-mail was scanned for viruses by Declude Virus
> (http://www.declude.com)]
>
> ---
> This E-mail came from the Declude.JunkMail mailing list. To
> unsubscribe, just send an E-mail to [EMAIL PROTECTED], and
> type "unsubscribe Declude.JunkMail". The archives can be found
> at http://www.mail-archive.com.
