https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7253
Bug ID: 7253
Summary: X-Spam-Report incorrectly mime-encodes multiline
report in header, violating RFC 2047
Product: Spamassassin
Version: 3.4.1
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: P2
Component: Libraries
Assignee: [email protected]
Reporter: [email protected]
With
report_safe 0
and a rule with a non-ASCII description, e.g.:
header L_TEST_REPORT_ENCODING From =~ /./
score L_TEST_REPORT_ENCODING 0.01
describe L_TEST_REPORT_ENCODING En-tête contient caractères
the resulting multiline X-Spam-Report header field, as inserted
into spam messages, is incorrectly encoded: the whole multiline
header field is wrapped in a single encoded-word, whitespace is
left unencoded inside that encoded-word, and the encoded-word
spans multiple lines:
X-Spam-Report: =?UTF-8?Q?
* 100 USER_IN_BLACKLIST From: address is in the user's black-list
* 0.0 L_TEST_REPORT_ENCODING En-t=c3=aate contient caract=c3=a8res
* -0.3 BAYES_05 BODY: Bayes spam probability is 1 to 5%
* [score: 0.0137]
* 0.3 TXREP TXREP: Score normalizing based on sender's reputation?=
This is wrong on multiple counts. RFC 2047 is explicit:
  An 'encoded-word' may not be more than 75 characters long, including
  'charset', 'encoding', 'encoded-text', and delimiters.
  [...]
  IMPORTANT: 'encoded-word's are designed to be recognized as 'atom's
  by an RFC 822 parser. As a consequence, unencoded white space
  characters (such as SPACE and HTAB) are FORBIDDEN within an
  'encoded-word'. For example, the character sequence
     =?iso-8859-1?q?this is some text?=
  would be parsed as four 'atom's, rather than as a single 'atom' (by
  an RFC 822 parser) or 'encoded-word' (by a parser which understands
  'encoded-words'). The correct way to encode the string "this is some
  text" is to encode the SPACE characters as well, e.g.
     =?iso-8859-1?q?this=20is=20some=20text?=
  [...]
  Only a subset of the printable ASCII characters may be used in
  'encoded-text'. Space and tab characters are not allowed, so that
  the beginning and end of an 'encoded-word' are obvious.
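The two constraints quoted above (no whitespace inside an encoded-word,
at most 75 characters each) can be checked mechanically. The following
Python sketch is only a heuristic illustration and is not part of
SpamAssassin; the name check_encoded_words and the simplified pattern
are assumptions made for this example.

```python
import re

# Simplified shape of one encoded-word per RFC 2047 section 2:
#   "=?" charset "?" encoding "?" encoded-text "?="
# Neither charset nor encoded-text may contain "?" or whitespace.
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]+\?=")
MAX_ENCODED_WORD = 75  # limit from RFC 2047, delimiters included


def check_encoded_words(value):
    """Return a list of problems found in a header field value.

    Heuristic: every whitespace-separated token that starts with "=?"
    or ends with "?=" is expected to be one complete encoded-word.
    """
    problems = []
    for token in value.split():
        if not (token.startswith("=?") or token.endswith("?=")):
            continue  # ordinary text, nothing to check
        if not ENCODED_WORD.fullmatch(token):
            problems.append("malformed or split encoded-word: " + repr(token))
        elif len(token) > MAX_ENCODED_WORD:
            problems.append("encoded-word exceeds 75 characters: " + repr(token))
    return problems
```

Applied to the X-Spam-Report value shown earlier, this reports both
the dangling "=?UTF-8?Q?" opener and the stray "?=" terminator, while
"=?iso-8859-1?q?this=20is=20some=20text?=" passes without complaints.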
The culprit is MS::PerMsgStatus::qp_encode_header().
It should encode (when necessary) each line individually,
and should encode whitespace within encoded-word(s).
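A minimal sketch of the proposed behaviour, written in Python for
illustration only (the actual routine is Perl, and this is not a
patch). The helper names q_encode_text and encode_report_line are
hypothetical, an ASCII-compatible charset such as UTF-8 is assumed,
and the folding of long results onto continuation lines is a
simplification.

```python
import string

# Bytes that may appear literally in Q-encoded text (RFC 2047 section 5,
# rule 3). "=" and "_" are excluded because they are escape characters.
SAFE_BYTES = frozenset((string.ascii_letters + string.digits + "!*+-/").encode("ascii"))
MAX_ENCODED_WORD = 75


def q_encode_text(text, charset="utf-8"):
    """Encode text as a list of Q encoded-words of at most 75 characters.

    Assumes an ASCII-compatible charset. Splits only between characters,
    so a multi-byte character never straddles two encoded-words.
    """
    prefix = "=?" + charset + "?Q?"
    suffix = "?="
    budget = MAX_ENCODED_WORD - len(prefix) - len(suffix)
    words, current = [], ""
    for ch in text:
        piece = "".join(
            "_" if b == 0x20 else chr(b) if b in SAFE_BYTES else "=%02X" % b
            for b in ch.encode(charset)
        )
        if current and len(current) + len(piece) > budget:
            words.append(prefix + current + suffix)
            current = ""
        current += piece
    if current:
        words.append(prefix + current + suffix)
    return words


def encode_report_line(line, charset="utf-8"):
    """Encode one report line on its own; pure-ASCII lines pass through."""
    if line.isascii():
        return line
    body = line.lstrip(" \t")
    indent = line[: len(line) - len(body)]
    # Fold between encoded-words; a continuation line must start with
    # whitespace, and decoders ignore whitespace between encoded-words.
    fold = "\n" + (indent or " ")
    return indent + fold.join(q_encode_text(body, charset))


def encode_report(report, charset="utf-8"):
    """Apply encode_report_line to every line of a multiline report."""
    return "\n".join(encode_report_line(l, charset) for l in report.split("\n"))
```

With this approach only the L_TEST_REPORT_ENCODING line of the example
report would be encoded; the remaining ASCII lines stay readable as
plain text, and no encoded-word contains whitespace or crosses a line
boundary.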