https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7144
--- Comment #4 from Mark Martinec <[email protected]> ---
> In looking at this, if +1's we will need to make Encode::Detect a
> requirement rather than optional

Not necessarily. Encode::Detect is now used only rarely, when other attempts fail, unlike in 3.4.0, where the module was essential for operation. I wouldn't even care much for this module, but I kept it because it has been in use before. It is still flagged as optional in DependencyInfo.pm, and its importance is played down in the DependencyInfo report.

> Also need to update the UPGRADE and README to reflect this change
> if we get another +1.

I wonder how effective the current drug-misspelling rules are, given that they assume Latin1 encoding. I haven't noticed any degradation since I began playing with normalize_charset and turned it on (rendering them ineffective), but that's just anecdotal.

Currently I don't see an easy way to let rules know what encoding they are dealing with, so we can't make them conditional (or tflagged). One possibility is to use 'rawbody' instead of 'body' for rules that expect the original encoding of a message. Rawbody avoids charset normalization, but it also avoids decoding HTML (which may or may not affect them).

I don't have a strong opinion on the default value of normalize_charset. For our site I certainly want it on (regardless of possibly rendering some stock rules ineffective), as it makes it easier to write rules for non-English text.

Perhaps a gentle nudge in the release notes, suggesting that people turn normalize_charset on when upgrading to 3.4.1, while leaving the default unchanged for this minor version update? The drag is that part of the user base will stay on pre-3.4.1 versions for quite some time still, yet keep their rules up to date.
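
To illustrate why a rule written for Latin1 bytes can silently stop matching once the text has been normalized, here is a minimal Python sketch. It is not SpamAssassin code: the pattern, the sample text, and the helper name rule_hits are hypothetical, and it assumes that normalization yields UTF-8 bytes.

```python
import re

# Hypothetical stand-in for a stock rule written against Latin-1 bytes:
# "v", then "i" or i-acute (byte 0xED in Latin-1), then "agra".
LATIN1_RULE = re.compile(rb"v[i\xed]agra", re.IGNORECASE)


def rule_hits(body: bytes) -> bool:
    """Return True if the byte-oriented pattern matches the body."""
    return LATIN1_RULE.search(body) is not None


# Sample text containing i-acute (U+00ED) in place of a plain "i".
text = "Buy v\u00edagra now"

latin1_body = text.encode("latin-1")  # bytes as a Latin-1 message would carry them
utf8_body = text.encode("utf-8")      # assumed result of charset normalization

print(rule_hits(latin1_body))  # True: 0xED is a single byte and is in the class
print(rule_hits(utf8_body))    # False: i-acute is now the two bytes 0xC3 0xAD
```

The same pattern sees different bytes depending on whether it runs before or after normalization, which is why rules that expect the original encoding would need to look at the unnormalized text.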
