https://bz.apache.org/SpamAssassin/show_bug.cgi?id=8310

Kent Oyer <k...@mxguardian.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |k...@mxguardian.net

--- Comment #2 from Kent Oyer <k...@mxguardian.net> ---
Created attachment 5999
  --> https://bz.apache.org/SpamAssassin/attachment.cgi?id=5999&action=edit
utf8_anchor_text.diff

Thanks for the patch. However, I think we just need to make sure the anchor
text is UTF-8 encoded like the rest of the body. I've committed the attached
one-line patch, which accomplishes that. I've verified that your original rule
fires if you remove the double backslashes:

uri_detail      UNICODE_LINK_TEXT text =~ /\x{E0}\x{B8}\x{97}\x{E0}\x{B8}\x{B1}\x{E0}\x{B8}\x{99}\x{E0}\x{B8}\x{97}\x{E0}\x{B8}\x{B5}/


Or, if your editor supports Unicode, you can use the actual Unicode characters:

uri_detail      UNICODE_LINK_TEXT text =~ /ต่ออายุทันที/

Just make sure to save the file in UTF-8 format.
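
As a rough cross-check of the two rule forms above, the following Python sketch
(not part of SpamAssassin; the helper name utf8_escapes is made up for this
illustration) prints the UTF-8 bytes of a string as \x{HH} escapes. It assumes
the rule is matched against UTF-8 encoded bytes, as described above. Note that
the escaped pattern covers only the last five characters of the literal text,
so it matches a substring of it.

```python
def utf8_escapes(text: str) -> str:
    """Return the UTF-8 bytes of `text` written as \\x{HH} escapes.

    Hypothetical helper for illustration only; it is not a SpamAssassin API.
    """
    return "".join(f"\\x{{{byte:02X}}}" for byte in text.encode("utf-8"))


# Anchor text from the literal form of the rule.
anchor_text = "ต่ออายุทันที"

# The five code points that the escaped form of the rule spells out,
# written with \u escapes so the check does not depend on the editor.
escaped_part = "\u0e17\u0e31\u0e19\u0e17\u0e35"

# Bytes taken directly from the escaped rule.
rule_bytes = bytes.fromhex("E0B897E0B8B1E0B899E0B897E0B8B5")

if __name__ == "__main__":
    print(utf8_escapes(escaped_part))
    print(rule_bytes in anchor_text.encode("utf-8"))
```

Running the helper on any other anchor text gives the escaped form to paste
into a rule when the rules file cannot be saved as UTF-8.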

Patch committed in revision 1923527.

-- 
You are receiving this mail because:
You are the assignee for the bug.
