https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7126

Mark Martinec <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Attachment #5271|0                           |1
        is obsolete|                            |
   Attachment #5272|0                           |1
        is obsolete|                            |

--- Comment #8 from Mark Martinec <[email protected]> ---
Created attachment 5277
  --> https://issues.apache.org/SpamAssassin/attachment.cgi?id=5277&action=edit
The suggested replacement subroutine MS::Message::Node::_normalize() - V2

In view of:

  [Bug 7133] Revisiting Bug 4046 - HTML::Parser: Parsing of undecoded UTF-8
     will give garbage when decoding entities,

  and HTML::Parser bug:
    https://rt.cpan.org/Public/Bug/Display.html?id=99755

it seems desirable for sub _normalize() to be able to return either
decoded characters (Unicode) or UTF-8 encoded octets, so I have
generalized the proposed replacement sub _normalize() to return one
or the other, depending on an optional parameter. When the parameter
is absent, it defaults to the current behaviour (returning UTF-8
octets), which preserves compatibility.
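
As a rough illustration of that calling convention only (this is not the
attached Perl code), here is a minimal Python sketch. The function name
normalize, the parameter name return_decoded, and the error handling are
all assumptions made for the example.

```python
def normalize(data: bytes, charset: str = "utf-8", return_decoded: bool = False):
    """Hypothetical stand-in for the described behaviour, not the real _normalize()."""
    # Decode the raw octets into Unicode characters; undecodable bytes are
    # replaced rather than raising (an assumption made for this sketch).
    text = data.decode(charset, errors="replace")
    if return_decoded:
        # Caller asked for decoded characters (a str).
        return text
    # Default: re-encode as UTF-8 octets (bytes), matching the old behaviour.
    return text.encode("utf-8")


# Same input, two possible return forms.
octets = normalize(b"\xe9", "latin-1")                       # UTF-8 octets
chars = normalize(b"\xe9", "latin-1", return_decoded=True)   # characters
```

Keeping the octets form as the default means existing callers need no
change; only code that wants characters passes the extra argument.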

Attached is my latest version of sub _normalize().



Bug 7126: Incorrect character set detections
by normalize_charset - sub _normalize() V2
  Sending lib/Mail/SpamAssassin/Message/Node.pm
Committed revision 1659255.

