I just caught another false positive, this time from an inline image that didn't trip the BASE64 filter and didn't contain the attachment marker. This is standard behavior for E-mail, so I'm going to have to figure out another way to avoid scoring such content. It will probably be necessary to move the exception testing into a separate filter so that a message can't hit more than one exception at a time. Spammers do use inline images on rare occasions, and I would hate to take extra points away from them.
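To show what I mean by not hitting more than one exception at a time, here is a rough Python sketch rather than actual filter syntax. The exception names, the point values, and the score_message function are all made up for illustration; the only idea taken from the above is that the credit gets subtracted once, no matter how many exceptions a message matches.

```python
def score_message(gibberish_points, exceptions_matched, exception_credit=5):
    """Subtract the exception credit at most once.

    gibberish_points   -- points the gibberish test added (hypothetical value)
    exceptions_matched -- dict of exception name -> bool (names are hypothetical)
    exception_credit   -- points removed when any exception applies (assumed)
    """
    # any() collapses several matching exceptions into a single credit,
    # so a message with both Base64 and an inline image is not credited twice.
    credit = exception_credit if any(exceptions_matched.values()) else 0
    # Never let the exceptions push the result below zero.
    return max(gibberish_points - credit, 0)


# Two exceptions match, but only one credit is applied.
print(score_message(10, {"base64": True, "inline_image": True}))
```

If each exception lived in the main filter as its own line, the same message could be credited once per line, which is exactly the extra discount I don't want to hand to spammers.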
BTW, both gibberish filters should remove the combinations qo because of QOS, qb because of QB, qv because of QVC, and qi because of Qi and other Chinese names. The list of combinations is getting smaller, but there is a limit to how tight the test should be. I've been using Google as a benchmark for letter combinations: qu, for instance, scores 41,500,000 results (allowed), qb scores 2,600,000, qi scores 2,360,000, but jq only scores 838,000. It seems that anything around 1,500,000 or less is about as good as it gets. This doesn't include cases where the letters appear inside a dictionary word, though, and those should be almost nonexistent. The goal is to find the least common combinations of all. Needless to say, there are enough exceptions that the test should score low no matter how refined it is, but it seems to be getting about 98% valid hits on spam even with the obvious limitations.
I'll post another copy of my file when I figure out the PGP and inline problems. If anyone has any pointers on other inline Base64 stuff, I'd appreciate hearing them. It's important to exclude everything that the BASE64 test doesn't catch, so knowing its strict criteria (i.e. what it looks for) would help. This might also mean excluding some inline text; I'm not sure yet. It still works pretty well though.
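For anyone who wants to play with the Google benchmark idea, here is a minimal Python sketch of it. The counts are the ones quoted above and the cutoff is the rough 1,500,000 figure; the function names and the sample text are just assumptions for illustration, and this is not how the actual filter file is written.

```python
# Result counts quoted in this post; any other pair would need its own lookup.
RESULT_COUNTS = {
    "qu": 41_500_000,
    "qb": 2_600_000,
    "qi": 2_360_000,
    "jq": 838_000,
}

# Rough cutoff from the post: pairs at or below this are rare enough to test for.
THRESHOLD = 1_500_000


def rare_combinations(counts, threshold=THRESHOLD):
    """Return the letter pairs whose result count is at or below the threshold."""
    return sorted(pair for pair, n in counts.items() if n <= threshold)


def gibberish_hits(text, pairs):
    """Count how many times any of the rare pairs occurs in the text."""
    lowered = text.lower()  # case-insensitive, so 'JQ' and 'jq' both count
    return sum(lowered.count(pair) for pair in pairs)


pairs = rare_combinations(RESULT_COUNTS)
print(pairs)
print(gibberish_hits("xjqzt JQk ordered from QVC", pairs))
```

With these numbers only jq survives the cutoff, so the sample text gets two hits and the QVC reference is left alone.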
And thanks to Kami for the kind words :)
Thanks,
Matt
--- [This E-mail was scanned for viruses by Declude Virus (http://www.declude.com)]
--- This E-mail came from the Declude.JunkMail mailing list. To unsubscribe, just send an E-mail to [EMAIL PROTECTED], and type "unsubscribe Declude.JunkMail". The archives can be found at http://www.mail-archive.com.
