https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7068

            Bug ID: 7068
           Summary: Plugin which counts unicode entities in text/plain
                    MIME parts
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Plugins
          Assignee: [email protected]
          Reporter: [email protected]

Created attachment 5221
  --> https://issues.apache.org/SpamAssassin/attachment.cgi?id=5221&action=edit
First attempt, patch against svn trunk

A MIME part claiming to be text/plain and containing Unicode characters must be
encoded as quoted-printable or base64.  If a part sent in 7bit or 8bit encoding
contains (HTML) Unicode entities, a mail client will not render them as Unicode
characters but will display them literally.

Thus a few such sequences might occur on a mailing list of developers
discussing these characters, but a message with a high density of such
sequences is likely spam.

Here we propose a check for this (attached as diffs) and an example rule:

if can(Mail::SpamAssassin::Plugin::MIMEEval::has_check_abundant_unicode_ratio)
body T_PP_TOO_MUCH_UNICODE       eval:check_abundant_unicode_ratio('T_PP_TOO_MUCH_UNICODE', 0.05)
describe T_PP_TOO_MUCH_UNICODE    Is text/plain but has many unicode escapes
score T_PP_TOO_MUCH_UNICODE    1.0
endif
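
For readers who want to experiment with the idea outside SpamAssassin, here is a
minimal Python sketch of the density check.  It is not the attached patch: the
function names, the entity pattern (numeric references only, e.g. &#9731; or
&#x2603;), and the definition of the ratio (characters belonging to entities
divided by total characters in the part) are all assumptions made for
illustration.  Only the 0.05 threshold is taken from the example rule.

```python
import email
import re

# Assumption: only numeric character references (&#1234; or &#x4D2;) are
# counted. The attached patch may match a different set of sequences.
ENTITY_RE = re.compile(r"&#(?:[0-9]+|[xX][0-9A-Fa-f]+);")


def unicode_entity_ratio(text: str) -> float:
    """Return the fraction of characters in text that belong to entities."""
    if not text:
        return 0.0
    entity_chars = sum(len(m.group(0)) for m in ENTITY_RE.finditer(text))
    return entity_chars / len(text)


def message_has_abundant_entities(raw_message: str,
                                  threshold: float = 0.05) -> bool:
    """Return True if any 7bit/8bit text/plain part exceeds the threshold."""
    msg = email.message_from_string(raw_message)
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        # A missing Content-Transfer-Encoding header defaults to 7bit.
        cte = (part.get("Content-Transfer-Encoding") or "7bit").strip().lower()
        if cte not in ("7bit", "8bit"):
            continue
        if unicode_entity_ratio(part.get_payload()) > threshold:
            return True
    return False
```

A part that merely mentions one or two entities in a long discussion stays well
under the threshold, while a body made up mostly of entities exceeds it.  Whether
0.05 is a sensible cutoff for real mail would need to be checked against a
corpus.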

-- 
You are receiving this mail because:
You are the assignee for the bug.