Chris Bayliss wrote:
>>> We're currently using a custom written smtp server that filters "bad words".
>>>
>>
>> Cue the Scunthorpe problem.
>
> We used to filter along these lines a long time ago until better
> things were available. What amazed me was the number of surnames that
> fell foul of the filter, such as Wank, Cock and Cunther (all real
> examples).
>
> The other issue is that mis-spelling is reasonably common to evade
> filters. Once you try matching similar words, the problem of
> false positives gets worse.

I think you could cover most of the cases quite easily with a small
amount of effort in the regex creation, i.e. use word boundaries and
account for obvious obfuscation tricks and misspellings:

/\bw[a4]nk(s|z|[e30o]r[sz5]?)\b/

However, this doesn't get rid of the case where somebody has a swear
word for a surname. That's when it might be a good idea to star out the
word rather than block the entire email. It depends on why you're
filtering, though, I suppose.

--
Mike Cardwell
IT Consultant .. http://cardwellit.com/
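
A minimal sketch of the "star out the word" idea, written in Python rather than in an actual mail filter. The names PATTERN and star_out are hypothetical, and two things here are assumptions rather than part of the original post: the character class after the "w" is closed as [a4], and the suffix group is made optional so that the bare word matches too.

```python
import re

# Assumed reading of the pattern from the post: character class closed
# as [a4], suffix group made optional (both are assumptions).
PATTERN = re.compile(r"\bw[a4]nk(?:s|z|[e30o]r[sz5]?)?\b", re.IGNORECASE)


def star_out(text: str) -> str:
    """Replace every match with asterisks of the same length."""
    return PATTERN.sub(lambda m: "*" * len(m.group(0)), text)


if __name__ == "__main__":
    # Obfuscated spelling is caught and masked.
    print(star_out("you w4nk3rz"))
    # Word boundaries keep an embedded match ("swanky") untouched.
    print(star_out("a swanky hotel"))
    # A real surname is still masked: the regex cannot tell the difference.
    print(star_out("Dear Mr Wank,"))
```

The last call shows the limit mentioned above: word boundaries stop embedded matches of the Scunthorpe kind, but a surname that is itself the word still gets masked, which is at least less drastic than rejecting the whole message.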
