https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7851

Bill Cole <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #5 from Bill Cole <[email protected]> ---
(In reply to RW from comment #1)
> meta MIME_CHARSET_FARAWAY     (__MIME_CHARSET_FARAWAY && __HIGHBITS)
> body __HIGHBITS                     /(?:[\x80-\xff].?){4}/
> 
> __HIGHBITS seems redundant as __MIME_CHARSET_FARAWAY includes a check that
> the rendered body octets are mostly high-bit.

Can you point me at that code? I don't see it, and the attached example that hit
the bug certainly isn't mostly, or even significantly, high-bit. The only
non-ASCII content I can find is a single 4-byte character in the properly
encoded UTF-8 Subject header (which SA prepends to the body for scanning).
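
To illustrate the point, here is a rough Python sketch of how the quoted
__HIGHBITS pattern behaves on such a message. It assumes the pattern is applied
to raw octets; the sample Subject and body are made up for illustration and are
not taken from the attached example.

```python
import re

# Byte-level translation of the quoted __HIGHBITS body rule.
# Assumption: the rule sees raw octets rather than decoded characters.
HIGHBITS = re.compile(rb"(?:[\x80-\xff].?){4}")

# Hypothetical sample: a Subject containing one 4-byte UTF-8 character,
# followed by a body that is pure ASCII.
subject = "Sale today \U0001F600".encode("utf-8")
body = b"Plain ASCII body text only."
rendered = subject + b"\n" + body

# Count the octets with the high bit set in the text that gets scanned.
high = sum(b >= 0x80 for b in rendered)

# Four adjacent high-bit octets are enough for the pattern to match,
# because the ".?" in each repetition may match nothing.
highbits_hit = HIGHBITS.search(rendered) is not None

print(high, len(rendered), highbits_hit)
```

Under those assumptions, one 4-byte character is enough to satisfy the pattern
even though nearly all of the scanned text is ASCII.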

> 'ascii' is not an official alias for US-ASCII and if it really is in ASCII
> than it shouldn't be mostly high-bit.

See https://www.iana.org/assignments/character-sets/character-sets.xhtml

The intro text on that page says that 'ASCII' is or was used as an alternative
name for US-ASCII. A careless reader could conclude that it is acceptable as a
MIME charset. At least one antique MUA (MailForge for Mac OS X) and one
significant current ESP (Mailgun) have generated mail with "charset=ascii" in
the Content-Type header.
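
As a rough sketch of what tolerating that label could look like, the Python
below reads the charset parameter from a Content-Type header and maps "ascii"
to "us-ascii" before any further checks. The alias table and the
normalize_charset helper are hypothetical; this is not the change made in
SpamAssassin.

```python
import email

# Hypothetical alias table: unofficial labels mapped to registered names.
CHARSET_ALIASES = {"ascii": "us-ascii"}

def normalize_charset(label):
    """Lower-case a charset label and replace known unofficial aliases."""
    if label is None:
        return None
    label = label.strip().lower()
    return CHARSET_ALIASES.get(label, label)

# Made-up message using the header seen in the wild.
raw = "Content-Type: text/plain; charset=ascii\n\nhello\n"
msg = email.message_from_string(raw)

declared = msg.get_content_charset()      # charset parameter, lower-cased
normalized = normalize_charset(declared)

print(declared, normalized)
```

With a mapping like this, a check that only knows registered charset names
would treat charset=ascii the same as charset=us-ascii.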

> I think the real problem may be that there is no check that there are
> high-bits outside of the decoded Subject line.

That would address this specific case. It would not address the underlying
problem: mail composers use charset=ascii where they should use
charset=us-ascii.
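
For reference, the suggested check could be sketched in Python roughly as
follows. It assumes the prepended Subject occupies exactly the first line of
the scanned text, which is a simplification, and the function name is
hypothetical.

```python
import re

# Same byte-level pattern as the quoted __HIGHBITS rule.
HIGHBITS = re.compile(rb"(?:[\x80-\xff].?){4}")

def highbits_outside_subject(rendered: bytes) -> bool:
    """Return True if the pattern matches after the first line.

    Assumption: the first line is the prepended Subject and everything
    after the first newline is body text.
    """
    _subject, _sep, rest = rendered.partition(b"\n")
    return HIGHBITS.search(rest) is not None

# High-bit octets only in the Subject line: no hit.
subject_only = "Sale \U0001F600".encode("utf-8") + b"\nPlain ASCII body."

# High-bit octets in the body as well (Latin-1 text): hit.
in_body = b"Plain subject\n" + "\u00e4\u00f6\u00fc\u00df".encode("latin-1")

print(highbits_outside_subject(subject_only), highbits_outside_subject(in_body))
```

This only narrows where the pattern is applied; it does nothing about the
charset=ascii label itself.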


This has been fixed in r1881066.
