I've had to deal with quite a bit of obfuscated spam over the years.
I started out having every possible obfuscation in every rule, and
whenever I discovered a new one, I had to go back and update every
single rule with it. The rules were massive and completely
unreadable.
Then I discovered replace_tags, which I can highly recommend looking
into, if you haven't already:
https://spamassassin.apache.org/full/3.1.x/doc/Mail_SpamAssassin_Plugin_ReplaceTags.html
https://github.com/apache/spamassassin/blob/trunk/rules/25_replace.cf
Using this makes the rules much easier to read when you come back to
them 6 months later, and it's much easier to reuse the same
obfuscations. Just update a tag in one place and the change applies to
all rules using it.
(Sorry, that sounded like a horrible sales-pitch from a TV-advertisement
or something..)
I've found the built-in tags are occasionally missing some special
characters, so I made a replace_tag for every letter, each of which
includes the built-in one. Here are a couple of examples:
replace_tag CUSTOM_C (<C>|\xe1\xb4\x84)
replace_tag CUSTOM_N (<N>|\xe2\x93\x9d|\xc6[\x9e\x9d]|\xef\xbd\x8e)
replace_tag CUSTOM_V (<V>)
Then I can add other custom characters I find to each letter there, if
the built-in tags are not catching the obfuscation.
I've found the easiest way to get the UTF-8 bytes for the characters is
a quick Python for-loop:
>>> for c in "ṣҿṽҿral":
...     print(f"{c}: {c.encode('utf8')}")
...
ṣ: b'\xe1\xb9\xa3'
ҿ: b'\xd2\xbf'
ṽ: b'\xe1\xb9\xbd'
ҿ: b'\xd2\xbf'
r: b'r'
a: b'a'
l: b'l'
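If you end up doing this often, the loop above can be wrapped in a small
helper that prints a ready-to-paste replace_tag line. This is only a
sketch: the function names to_hex_escapes and replace_tag_line are made
up for this example, and it assumes your rules match against the raw
UTF-8 bytes, as the \x.. sequences in the tags above do.

```python
def to_hex_escapes(ch: str) -> str:
    """Return the UTF-8 bytes of ch written as \\xNN escapes."""
    return "".join(f"\\x{b:02x}" for b in ch.encode("utf-8"))


def replace_tag_line(name: str, builtin: str, lookalikes: str) -> str:
    """Build a replace_tag line: the built-in tag first, then each lookalike.

    dict.fromkeys() drops duplicate characters while keeping their order.
    """
    alternatives = [f"<{builtin}>"]
    alternatives += [to_hex_escapes(c) for c in dict.fromkeys(lookalikes)]
    return f"replace_tag {name} ({'|'.join(alternatives)})"


# Hypothetical tag name CUSTOM_E, fed with the "e" lookalike from the spam.
print(replace_tag_line("CUSTOM_E", "E", "ҿҿ"))
```

Paste the printed line into your own .cf file and review it by hand
before using it; the helper does no checking of whether a character
really is a lookalike for that letter.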
In the end, you can make either one rule that catches both the normal
and obfuscated versions, or separate them so you can punish obfuscated
versions even harder:
body __BODY_VIAGRA /(^|[^a-zA-Z0-9\.]|<CUSTOM_WORD_SEP>)viagra([^a-zA-Z0-9]|$)/i
body __BODY_VIAGRA_OBF /(^|[^a-zA-Z0-9]|<CUSTOM_WORD_SEP>)(?!\bviagra\b)<CUSTOM_V><CUSTOM_I><CUSTOM_A><CUSTOM_G><CUSTOM_R><CUSTOM_A>([^a-zA-Z0-9]|$)/i
replace_rules __BODY_VIAGRA __BODY_VIAGRA_OBF
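To try out a pattern like the ones above before putting it into a rule,
the tag expansion can be roughly imitated in Python. This is not how the
ReplaceTags plugin is implemented, just an approximation for quick
experiments: the TAGS table and expand_tags are invented for this
sketch, the table is far smaller than the real one in 25_replace.cf, and
the byte sequences are the ones printed by the for-loop earlier.

```python
import re

# Hypothetical, deliberately tiny tag table (illustration only).
TAGS = {
    b"S": rb"(?:s|\xe1\xb9\xa3)",
    b"E": rb"(?:e|\xd2\xbf)",
    b"V": rb"(?:v|\xe1\xb9\xbd)",
}


def expand_tags(pattern: bytes) -> bytes:
    """Replace each <NAME> with its fragment; unknown tags stay untouched."""
    return re.sub(
        rb"<(\w+)>",
        lambda m: TAGS.get(m.group(1), m.group(0)),
        pattern,
    )


# The negative lookahead skips the plain spelling, like (?!\bviagra\b) above.
OBF = re.compile(expand_tags(rb"(?!several)<S><E><V><E>ral"), re.IGNORECASE)

spam = "Approximately ṣҿṽҿral monthṡ ago".encode("utf-8")
ham = b"several months ago"

print(bool(OBF.search(spam)))  # obfuscated spelling
print(bool(OBF.search(ham)))   # plain spelling
```

The pattern is matched against bytes rather than a decoded string,
because the \x.. sequences in the tags describe individual UTF-8 bytes.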
I would say start out with the built-in ones from the 25_replace.cf
file, and if you see they're not catching certain characters, start
creating your own versions and add those characters.
As others have pointed out, it might cause issues if you actually have
people writing in languages that use those special characters, but
that's the eternal joy of managing a spam-filter..
On 12/15/25 2:04 AM, Mark London wrote:
Hi - One of users got a bitcoin blackmail email, that use special
characters to avoid the bitcoin spam rules. Does anybody have rules
that detect this type of obfuscation? Thanks. - Mark
Begin forwarded message:
*From:* Ashley Adkins <[email protected]>
*Date:* December 12, 2025 at 3:51:30 PM EST
*Subject:* *Reminder! Check this message now*
Greetings!
I nҿҿd to inform bad nĕwṣ with you.
Approximately ṣҿṽҿral monthṡ ago I obtainễd accȩṡṡ to your gadgễtŝ,
which you uṩẽ for wҿb _(krxvtgqb) _ṣurfing. Aftҿr that, I _(qofyata)
_haⱱê ṥtartȅd tracking your intẹrnẹt activities.
Here iṩ thḗ ṣȇquȇncȇ of events:
--
Martin Flygenring (maf)
Systems Engineer, group.one / one.com