------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugs.exim.org/show_bug.cgi?id=1419
           Summary: PCRE slower at matching UTF8 character classes.
           Product: PCRE
           Version: 8.33
          Platform: Other
        OS/Version: All
            Status: NEW
          Severity: wishlist
          Priority: medium
         Component: Code
        AssignedTo: [email protected]
        ReportedBy: [email protected]
                CC: [email protected]


The following benchmark shows that PCRE is much slower at matching a regex
that contains UTF-8 characters than one that contains only ASCII characters,
even if the subject string itself is purely ASCII.

Without the JIT this slowdown is quite dramatic: a regex that took about
200 ms to run takes about 5.3 seconds when the pattern contains UTF-8
characters. With the JIT, the slowdown is much smaller, but still quite
noticeable.

[b-z]                  no utf  no jit   191 ms
[b-z]                  utf     no jit   326 ms
[\x{fe000}-\x{fefff}]  utf     no jit  5288 ms
[b-z]                  no utf  jit      137 ms
[b-z]                  utf     jit      229 ms
[\x{fe000}-\x{fefff}]  utf     jit      400 ms

This benchmark was run on an Intel E5-2660 CPU in 64-bit mode, with PCRE
compiled from trunk.

-- 
Configure bugmail: http://bugs.exim.org/userprefs.cgi?tab=email

-- 
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
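
To make the size of the slowdown easier to read off, the timings reported
above can be turned into ratios against the ASCII, non-UTF [b-z] run with
the same JIT setting. The short Python sketch below only does arithmetic on
the six numbers from the report; it does not run PCRE, and the names
TIMINGS_MS and slowdowns are made up for this illustration. By this measure
the non-ASCII class is roughly 28 times slower than the baseline without the
JIT and roughly 2.9 times slower with it.

```python
# Timings in milliseconds, copied from the benchmark results in the report.
# Keys are (pattern, utf mode, jit mode). The dictionary name is hypothetical.
TIMINGS_MS = {
    ("[b-z]", "no utf", "no jit"): 191,
    ("[b-z]", "utf", "no jit"): 326,
    ("[\\x{fe000}-\\x{fefff}]", "utf", "no jit"): 5288,
    ("[b-z]", "no utf", "jit"): 137,
    ("[b-z]", "utf", "jit"): 229,
    ("[\\x{fe000}-\\x{fefff}]", "utf", "jit"): 400,
}


def slowdowns(timings):
    """Divide each timing by the [b-z] / no utf run with the same JIT setting."""
    ratios = {}
    for (pattern, utf, jit), ms in timings.items():
        baseline_ms = timings[("[b-z]", "no utf", jit)]
        ratios[(pattern, utf, jit)] = ms / baseline_ms
    return ratios


if __name__ == "__main__":
    # Print one line per benchmark row with its slowdown factor.
    for (pattern, utf, jit), ratio in slowdowns(TIMINGS_MS).items():
        print(f"{pattern:24} {utf:7} {jit:7} {ratio:5.1f}x")
```

Comparing against the baseline with the same JIT setting keeps the gain from
the JIT itself separate from the cost of UTF mode and of the non-ASCII class.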
