https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7656

--- Comment #5 from Henrik Krohns <apa...@hege.li> ---
I ran performance tests with mass-check and there's no measurable difference
here for normalize_charset; total duration was always within the normal +-2%
variance.

Rule hit differences between the runs were mainly:

__HIGHBITS
MPART_ALT_DIFF_COUNT
TVD_SPACE_RATIO
__freemail_safe_fwd

As we can see from __freemail_safe_fwd, if normalize is on, we can't assume
that a single dot will match a character like "ä". Committed a
(?:\xe4|\xc3\xa4) fix for it. The question arises whether regexes should be
run with Unicode semantics (. = single character) instead of matching raw
bytes.
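
To illustrate the byte-vs-character issue, here is a minimal sketch in Python
(not SpamAssassin's Perl, and not the actual rule code). It assumes the
character in question is "ä", which is \xe4 in Latin-1 and \xc3\xa4 in UTF-8;
all variable names are made up for this example.

```python
import re

text = "ä"
latin1_bytes = text.encode("latin-1")  # b'\xe4'      (one byte)
utf8_bytes = text.encode("utf-8")      # b'\xc3\xa4'  (two bytes)

# Byte semantics: "." consumes exactly one byte.
one_byte = re.compile(rb".")
dot_matches_latin1 = one_byte.fullmatch(latin1_bytes) is not None
dot_matches_utf8 = one_byte.fullmatch(utf8_bytes) is not None

# Alternation in the style of the committed fix: accept either encoding.
either = re.compile(rb"(?:\xe4|\xc3\xa4)")
alt_matches_latin1 = either.fullmatch(latin1_bytes) is not None
alt_matches_utf8 = either.fullmatch(utf8_bytes) is not None

# Character semantics: decode first, then "." consumes one character.
one_char = re.compile(r".")
char_dot_matches = one_char.fullmatch(utf8_bytes.decode("utf-8")) is not None

print(dot_matches_latin1, dot_matches_utf8)  # True False
print(alt_matches_latin1, alt_matches_utf8)  # True True
print(char_dot_matches)                      # True
```

With byte matching, a lone dot covers the Latin-1 form but not the two-byte
UTF-8 form, which is why an explicit alternation (or character semantics) is
needed once the body has been normalized to UTF-8.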

Have to investigate if the others need fixing.
