https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7851
Bill Cole <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #5 from Bill Cole <[email protected]> ---
(In reply to RW from comment #1)
> meta MIME_CHARSET_FARAWAY (__MIME_CHARSET_FARAWAY && __HIGHBITS)
> body __HIGHBITS /(?:[\x80-\xff].?){4}/
>
> __HIGHBITS seems redundant as __MIME_CHARSET_FARAWAY includes a check that
> the rendered body octets are mostly high-bit.

Can you point me at that code? I don't see it and the attached example that
hit the bug certainly isn't mostly or even significantly high-bit, with the
only non-ascii I can find being a single 4-byte character in the
properly-encoded utf-8 Subject header (which SA prepends to the body for
scanning.)

> 'ascii' is not an official alias for US-ASCII and if it really is in ASCII
> than it shouldn't be mostly high-bit.

See https://www.iana.org/assignments/character-sets/character-sets.xhtml

The intro text on that page says that 'ASCII' is or was used as an
alternative name for US-ASCII. A careless reading could result in one
believing it to be acceptable as a MIME charset. At least one antique MUA
(MailForge for MacOS X) and one significant current ESP (MailGun) have
created mail with "charset=ascii" in the Content-Type header.

> I think the real problem may be that there is no check that there are
> high-bits outside of the decoded Subject line.

That would address this specific case. It would not address the underlying
fact of mail composers using charset=ascii where they should use
charset=us-ascii.

This has been fixed in r1881066.

-- 
You are receiving this mail because:
You are the assignee for the bug.
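
For readers following the __HIGHBITS discussion, here is a rough sketch of how the quoted pattern behaves on raw UTF-8 octets. It uses Python's re module on bytes as a stand-in for SpamAssassin's Perl matching, so treat it as an approximation; the helper name has_highbits and the sample strings are made up for illustration and are not taken from the bug's attachment. It suggests that a single 4-byte UTF-8 character is already enough to satisfy the pattern, since all four of its octets are high-bit.

```python
import re

# Byte-level equivalent of the quoted rule body:
#   body __HIGHBITS /(?:[\x80-\xff].?){4}/
# Four high-bit octets, each optionally followed by one other octet.
HIGHBITS = re.compile(rb'(?:[\x80-\xff].?){4}')


def has_highbits(octets: bytes) -> bool:
    """Hypothetical helper: True if the pattern matches anywhere in octets."""
    return HIGHBITS.search(octets) is not None


# One 4-byte UTF-8 character (U+1F600 encodes as f0 9f 98 80).
subject_like = "Hello \U0001F600 world".encode("utf-8")

# Pure ASCII text, and text with a single 2-byte character (c3 a9).
plain = b"plain ascii text"
accented = "caf\u00e9".encode("utf-8")

for label, sample in [("4-byte char", subject_like),
                      ("plain", plain),
                      ("one 2-byte char", accented)]:
    print(label, has_highbits(sample))
```

If the decoded Subject is prepended to the body text before body rules run, as described above, this would be consistent with an otherwise ASCII message hitting the rule.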
