The way we handle it in http://www.pccc.com/downloads/SpamAssassin/contrib/KAM.cf is to use a regex like /this.advertisement/ unanchored by \b.

When matching against phrases like yours, we find that the word boundary adds no specificity to the rule, because the odds of matching a different word or phrase are essentially nil. Leaving it out also means we catch almost every obfuscation of the word boundaries.
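
As a rough illustration of the unanchored approach, here is a minimal sketch in Python, whose re module treats "." and "_" the same way as the Perl-style regexes in SpamAssassin rules for this purpose. The sample strings are made up for the demonstration and are not taken from KAM.cf.

```python
import re

# Python stand-in for an unanchored rule body like /this.advertisement/.
# The "." accepts any single separator character and there is no \b
# for the obfuscation to defeat.
unanchored = re.compile(r"this.advertisement")

# Hypothetical sample texts, invented for this sketch.
samples = [
    "remove yourself from this advertisement",
    "remove_yourself_from_this_advertisement",
    "remove-yourself-from-this-advertisement",
]

hits = [bool(unanchored.search(s)) for s in samples]
print(hits)  # [True, True, True]
```

The trade-off is that "." also accepts letters and digits in the separator position, which is presumably harmless for a phrase this specific.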

Good catch, though. We do have some rules in KAM.cf that can be evaded this way, and off the top of my head I can think of several stock SA rules that are vulnerable too.

On 6/5/2014 9:44 PM, John Hardin wrote:
All:

I've run across a new text obfuscation method in active use by spammers. It appears to be an attempt to bypass RE-based text matching of words. Rules you write will need modification to not be spoofed by this.

Unfortunately the RE engine considers the underscore as being a "word" character, so a rule like /\bthis advertisement\b/ can be defeated by replacing the spaces in the sentence with underscores. This is still readable to a human but foils the word-boundary check.
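
The failure can be reproduced with a short sketch. It uses Python's re module as a stand-in for the Perl regex engine; both count the underscore as a \w character. The sample sentences are invented for the demonstration.

```python
import re

# Stand-in for a rule body like /\bthis advertisement\b/.
naive = re.compile(r"\bthis advertisement\b")
# Same rule with "." for the space, to isolate the effect of \b alone.
dotted = re.compile(r"\bthis.advertisement\b")

plain = "Remove yourself from this advertisement today"
spoofed = "Remove_yourself_from_this_advertisement_today"

plain_hit = bool(naive.search(plain))            # True
spoofed_hit = bool(naive.search(spoofed))        # False: the literal space no longer matches
dotted_spoofed_hit = bool(dotted.search(spoofed))  # False: "_" then "t" is not a word boundary

print(plain_hit, spoofed_hit, dotted_spoofed_hit)  # True False False
```

Note that the underscores break the rule twice over: the literal space stops matching, and even with the space relaxed, \b still fails because "_" and "t" are both word characters.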

Recommendation: instead of a bare \b, use (?:\b|_) and instead of embedded spaces use [-_\s]

Examples:

Manage_advertising_preferences_here

To_remove_yourself_from_this_admail,_please_do_so_here

Be_removed_from_this_important_offer
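
A minimal sketch of the recommendation applied to the examples above, written in Python as a stand-in for the Perl-style regexes in SpamAssassin rules. The harden() helper and the three phrases are hypothetical, chosen only to exercise the sample lines; they are not existing rules.

```python
import re

def harden(phrase):
    """Build a regex from a plain phrase using the suggested substitutions:
    (?:\\b|_) in place of a bare \\b, and [-_\\s] in place of each space."""
    edge = r"(?:\b|_)"
    sep = r"[-_\s]"
    words = [re.escape(w) for w in phrase.split()]
    return re.compile(edge + sep.join(words) + edge, re.IGNORECASE)

# (phrase, obfuscated sample) pairs; the phrases are illustrative only.
cases = [
    ("advertising preferences", "Manage_advertising_preferences_here"),
    ("remove yourself from",
     "To_remove_yourself_from_this_admail,_please_do_so_here"),
    ("removed from this", "Be_removed_from_this_important_offer"),
]

results = [bool(harden(p).search(text)) for p, text in cases]
print(results)  # [True, True, True]

# The hardened pattern still matches ordinary spacing and still
# refuses to match in the middle of a longer word.
still_plain = bool(harden("removed from this").search("Be removed from this offer"))
mid_word = bool(harden("removed from this").search("Be unremoved from this offer"))
print(still_plain, mid_word)  # True False
```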

