https://issues.apache.org/SpamAssassin/show_bug.cgi?id=6945

            Bug ID: 6945
           Summary: sa-learn dies on non-ASCII characters in Message-ID:
                    (when LANG=C)
           Product: Spamassassin
           Version: 3.3.1
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: Learner
          Assignee: [email protected]
          Reporter: [email protected]

sa-learn dies with the following error if a message contains non-ASCII
characters in the host name portion of its Message-ID header:

plugin: eval failed: bayes: (in learn) Wide character in subroutine entry at
/usr/share/perl5/Mail/SpamAssassin/BayesStore/DBM.pm line 826.
ERROR: the Bayes learn function returned an error, please re-run with -D for
more information at /usr/bin/sa-learn line 493.

In this particular case the offending Message-ID line contains Cyrillic
characters encoded in CP-1251.
If decoded from CP-1251 to UTF-8, the line would read:

Message-ID: <8B8D9181744049F381C34F99B078DC9B@Костя-ПК>

After removing this line from the message, sa-learn worked just fine.
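
To make the encoding detail concrete, here is a minimal sketch in Python (not
part of SpamAssassin, which is written in Perl). It rebuilds the header bytes
under the assumption that the host part was sent as raw CP-1251, shows that
those bytes fall outside ASCII, and decodes them into the form quoted above.
The helper name has_non_ascii is hypothetical.

```python
# Assumption: the host part of the Message-ID was sent as raw CP-1251 bytes.
host = "Костя-ПК"
raw_header = (
    b"Message-ID: <8B8D9181744049F381C34F99B078DC9B@"
    + host.encode("cp1251")
    + b">"
)


def has_non_ascii(data: bytes) -> bool:
    """Return True if any byte is outside the 7-bit ASCII range."""
    return any(byte > 0x7F for byte in data)


# Decoding from CP-1251 yields the readable header shown in this report.
decoded = raw_header.decode("cp1251")

# Re-encoding as UTF-8 takes two bytes per Cyrillic letter instead of one.
utf8_bytes = decoded.encode("utf-8")

print(raw_header)                 # bytes repr, with \x.. escapes for the host
print(has_non_ascii(raw_header))
```

The sketch only illustrates the input data; it does not reproduce the Perl
"Wide character" error itself.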

A sample (heavily redacted and massaged) message in mbox format is attached.
Try the following command to reproduce the problem:
LANG=C sa-learn --spam --mbox sample.mbox
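
As a possible aid for finding affected messages before running sa-learn, the
following Python sketch scans an mbox stream for Message-ID headers containing
non-ASCII bytes. It is an illustration only, not a fix: the function name
find_non_ascii_message_ids is hypothetical, the parsing is deliberately
simplified (it treats every line starting with "From " as a message separator
and ignores folded headers), and the in-memory sample is made up.

```python
import io


def find_non_ascii_message_ids(stream):
    """Yield (line_number, raw_line) for Message-ID headers with bytes > 0x7F.

    `stream` must be a binary file-like object in mbox format.
    """
    in_headers = False
    for lineno, line in enumerate(stream, start=1):
        if line.startswith(b"From "):
            # Simplified mbox separator: a new message and its headers begin.
            in_headers = True
            continue
        if in_headers and line.strip() == b"":
            # A blank line ends the header block.
            in_headers = False
            continue
        if in_headers and line.lower().startswith(b"message-id:"):
            if any(byte > 0x7F for byte in line):
                yield lineno, line.rstrip(b"\r\n")


# Made-up two-message mbox; the first Message-ID carries CP-1251 bytes.
sample = (
    b"From sender@example.com Thu Jan  1 00:00:00 2013\n"
    b"Message-ID: <ABC@" + "Костя-ПК".encode("cp1251") + b">\n"
    b"Subject: test\n"
    b"\n"
    b"body\n"
    b"\n"
    b"From other@example.com Thu Jan  1 00:00:01 2013\n"
    b"Message-ID: <DEF@example.com>\n"
    b"\n"
    b"body\n"
)

hits = list(find_non_ascii_message_ids(io.BytesIO(sample)))
for lineno, raw in hits:
    print(lineno, raw)
```

For a real file, the same function could be called on open("sample.mbox", "rb").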

-- 
You are receiving this mail because:
You are the assignee for the bug.
