Re: double letter porn

Chris St. Pierre Thu, 05 Oct 2006 13:22:13 -0700

One thing I've wondered/thought about is using the Levenshtein
distance between the words in an email and a list of spam words
(ideally pulled from the bayes db).  In this case, all of the
misspelled words in that sample have an L-distance of 1 from the real
word -- in other words, they're *very* close.
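
To make the idea concrete, here is a rough sketch in Python (not
SpamAssassin plugin code).  The SPAM_WORDS list, the near_spam_words
helper and the max_distance cutoff are all made up for illustration;
a real version would pull its words from the bayes db.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete from a
                               current[j - 1] + 1,       # insert into a
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


# Hypothetical word list, standing in for words pulled from the bayes db.
SPAM_WORDS = ["viagra", "casino", "mortgage"]


def near_spam_words(text, max_distance=1):
    """Return (token, spam_word) pairs whose distance is within max_distance."""
    hits = []
    for token in text.lower().split():
        for word in SPAM_WORDS:
            # Cheap length check first: the distance can never be smaller
            # than the difference in length.
            if abs(len(token) - len(word)) > max_distance:
                continue
            if levenshtein(token, word) <= max_distance:
                hits.append((token, word))
    return hits
```

Note that every token is compared against every list word, and each
comparison is itself quadratic in word length, so the work grows
quickly with message size and list size.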

I think the problem would be that this would consume tons of
resources, since every word in a message has to be compared against
every word in the list.  Anything cheaper, though, would be
susceptible to other typo attacks.  For instance, if you took each
email and replaced all doubled letters with single letters, it
wouldn't be long before you were getting spam advertising
"analr bictches" or the like.
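
A quick sketch of that cheaper approach and where it breaks, again in
Python and purely illustrative (the word list and function names are
invented for the example):

```python
import re

# Hypothetical word list for illustration only.
SPAM_WORDS = {"viagra", "casino"}


def collapse_doubles(word):
    """Replace any run of a repeated character with a single copy."""
    return re.sub(r"(.)\1+", r"\1", word)


# Collapse the list too, so list words that legitimately contain
# doubled letters still compare equal after normalisation.
COLLAPSED_SPAM_WORDS = {collapse_doubles(w) for w in SPAM_WORDS}


def matches_after_collapse(token):
    """True if the token equals a spam word once doubled letters are removed."""
    return collapse_doubles(token.lower()) in COLLAPSED_SPAM_WORDS
```

This catches "viaagra" and "cassinno", but "viagrar" (one extra
letter) and "csaino" (two letters swapped) both slip straight
through, which is the weakness described above.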

Chris St. Pierre
Unix Systems Administrator
Nebraska Wesleyan University

On Wed, 4 Oct 2006, Eric A. Hall wrote:

>
>On 10/4/2006 5:57 PM, Richard Doyle wrote:
>> I've been getting lots of porn site spam containing words with doubled
>> letters, like this one:
>
>> Can anybody suggest a rule or ruleset to catch these double-letter
>> obfuscations? I'm using Spamassassin 3.1.4.
>
>You'd probably need to write a plug-in that used some kind of
>typo-matching logic to find porno words.
>
>Would be a good plug-in actually. Get busy :)
>
>-- 
>Eric A. Hall                                        http://www.ehsco.com/
>Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/
>
