https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7068
Bug ID: 7068
Summary: Plugin which counts unicode entities in text/plain MIME parts
Product: Spamassassin
Version: SVN Trunk (Latest Devel Version)
Hardware: PC
OS: Linux
Status: NEW
Severity: normal
Priority: P2
Component: Plugins
Assignee: [email protected]
Reporter: [email protected]
Created attachment 5221
--> https://issues.apache.org/SpamAssassin/attachment.cgi?id=5221&action=edit
First attempt, patch against svn trunk
A MIME part that claims to be text/plain and contains Unicode characters must
be encoded as quoted-printable or base64. If a part sent in 7bit or 8bit
encoding contains (HTML) Unicode entities, they are not rendered as Unicode
characters but are shown literally.
A few such sequences might therefore occur on a mailing list where developers
discuss these characters, but a message with a high density of them is likely
spam.
Here we propose such a check (attached as diffs), and an example rule:
if can(Mail::SpamAssassin::Plugin::MIMEEval::has_check_abundant_unicode_ratio)
  body T_PP_TOO_MUCH_UNICODE eval:check_abundant_unicode_ratio('T_PP_TOO_MUCH_UNICODE', 0.05)
  describe T_PP_TOO_MUCH_UNICODE Is text/plain but has many unicode escapes
  score T_PP_TOO_MUCH_UNICODE 1.0
endif
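
For readers who want a feel for what a threshold like 0.05 means, here is a
rough sketch of a density check in Python. It is not the attached patch. The
entity pattern, the definition of the ratio (entities per character of the
part), and the names unicode_entity_ratio and is_abundant are all assumptions
made for illustration; the real plugin may count differently.

```python
import re

# Assumption: a "Unicode entity" is an HTML numeric character reference,
# e.g. &#8364; (decimal) or &#x20AC; (hex). The patch may match other forms.
ENTITY_RE = re.compile(r"&#(?:[0-9]{1,7}|[xX][0-9A-Fa-f]{1,6});")


def unicode_entity_ratio(text: str) -> float:
    """Return the number of entities divided by the length of the text.

    Assumption: the ratio is entities per character; an empty part gives 0.0.
    """
    if not text:
        return 0.0
    return len(ENTITY_RE.findall(text)) / len(text)


def is_abundant(text: str, threshold: float = 0.05) -> bool:
    """True if the entity density exceeds the threshold (0.05 as in the rule)."""
    return unicode_entity_ratio(text) > threshold


if __name__ == "__main__":
    # A developer discussing one entity stays below the threshold.
    print(is_abundant("Use &#8364; for the euro sign in HTML."))
    # A body made almost entirely of entities is flagged.
    print(is_abundant("&#1055;&#1088;&#1080;&#1074;&#1077;&#1090;"))
```

Under these assumptions a single entity in a sentence of ordinary text gives a
ratio of roughly 0.03, while a body written entirely in entities gives about
0.14, so the two cases fall on opposite sides of 0.05.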