https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7126
Mark Martinec <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #5271|0                           |1
        is obsolete|                            |
   Attachment #5272|0                           |1
        is obsolete|                            |

--- Comment #8 from Mark Martinec <[email protected]> ---
Created attachment 5277
  --> https://issues.apache.org/SpamAssassin/attachment.cgi?id=5277&action=edit
The suggested replacement subroutine MS::Message::Node::_normalize() - V2

In view of:
  [Bug 7133] Revisiting Bug 4046 - HTML::Parser: Parsing of undecoded UTF-8
  will give garbage when decoding entities,
and HTML::Parser bug:
  https://rt.cpan.org/Public/Bug/Display.html?id=99755

it seems desirable to be able to obtain from sub _normalize either
decoded characters (Unicode), or encoded as UTF-8 octets, so I have
generalized the proposed replacement sub _normalize() to provide
one or the other, based on an optional parameter. In its absence
it defaults to current behaviour (returns UTF-8 octets), preserving
compatibility.

Attached is my last version of sub _normalize().

  Bug 7126: Incorrect character set detections by normalize_charset
  - sub _normalize() V2
Sending lib/Mail/SpamAssassin/Message/Node.pm
Committed revision 1659255.
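
For readers who want the calling convention spelled out, here is a minimal sketch of the idea described above: one routine that returns UTF-8 octets by default and decoded characters when an optional flag is set. It is written in Python purely as an illustration and is not the attached Perl code. The function name `normalize`, the parameter `return_decoded`, and the use of `errors="replace"` are all assumptions made for this example.

```python
def normalize(data: bytes, charset: str = "utf-8", return_decoded: bool = False):
    """Hypothetical analogue of a normalize routine with an optional return mode.

    data           -- raw octets as received
    charset        -- declared character set of the octets (assumed to be known)
    return_decoded -- hypothetical flag: if true, return decoded characters (str);
                      if omitted, return UTF-8 octets (bytes), i.e. the
                      backward-compatible default
    """
    # Decode the declared charset into Unicode characters.
    # errors="replace" is an assumption of this sketch, not taken from the patch.
    text = data.decode(charset, errors="replace")

    if return_decoded:
        return text                # caller asked for characters
    return text.encode("utf-8")    # default: UTF-8 octets


# Existing callers that pass no flag keep receiving octets.
octets = normalize(b"caf\xe9", "latin-1")
# New callers can opt in to decoded characters.
chars = normalize(b"caf\xe9", "latin-1", return_decoded=True)
```

Keeping octets as the default means no existing call site has to change, while a consumer that needs characters, such as an HTML parser that must see decoded text before expanding entities, can opt in explicitly.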
