http://bugs.exim.org/show_bug.cgi?id=1419

--- Comment #23 from Zoltan Herczeg 2014-01-05 17:13:24 ---

According to my measurements, all tests now run at the same speed.

I also made another optimization: partial UTF character decoding. The [\x{fe000}-\x{fefff}] range gave me the idea. Only four-byte UTF characters can match this range, so there is no need to decode shorter UTF byte sequences. There is now a read_char_range() function which returns the correct character code when it lies between a minimum and a maximum value. All other characters are decoded to an arbitrary value, but this value _must_ be outside the min-max range. read_char_range() generates optimized character-reading code (basically, we can skip several checks).

The function has a boolean update_str_ptr argument, which says that the string pointer must be updated for all characters, not only for those inside the range. Updating the string pointer can be done without decoding the character (and it is faster as well). This is useful for negative classes, where every character outside the range must be accepted.

This optimization mostly targets UTF-8, but UTF-16 gets some benefit as well.
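
To make the idea concrete, here is a rough sketch in plain C. It is only an illustration, not the actual patch: the real read_char_range() emits JIT code, while this is an ordinary runtime function. The name read_char_range_sketch, the helper utf8_len, the len_min/len_max tables and the choice of out-of-range value are all made up for this example, and the input is assumed to be valid UTF-8. One more assumption: a character that does get decoded always advances the pointer here, even if its value ends up outside the range.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length of a UTF-8 sequence, taken from its lead byte (valid input assumed). */
static size_t utf8_len(uint8_t lead)
{
    if (lead < 0x80) return 1;
    if (lead < 0xE0) return 2;
    if (lead < 0xF0) return 3;
    return 4;
}

/* Smallest and largest code point that an n-byte sequence can encode. */
static const uint32_t len_min[5] = { 0, 0x00, 0x80,  0x800,  0x10000  };
static const uint32_t len_max[5] = { 0, 0x7F, 0x7FF, 0xFFFF, 0x10FFFF };

/* Hypothetical runtime analogue of read_char_range(). Returns the exact
   code point if it may fall inside [min, max]; otherwise returns some
   value that is guaranteed to lie outside [min, max]. */
uint32_t read_char_range_sketch(const uint8_t **pp, uint32_t min,
                                uint32_t max, int update_str_ptr)
{
    const uint8_t *p = *pp;
    size_t len = utf8_len(p[0]);
    uint32_t c;

    assert(min <= max);

    /* No code point of this byte length can be in the range: skip decoding. */
    if (len_max[len] < min || len_min[len] > max) {
        if (update_str_ptr)
            *pp = p + len;              /* advance without decoding */
        return (min > 0) ? 0 : max + 1; /* any value outside [min, max] */
    }

    switch (len) {
    case 1:
        c = p[0];
        break;
    case 2:
        c = ((uint32_t)(p[0] & 0x1F) << 6) | (uint32_t)(p[1] & 0x3F);
        break;
    case 3:
        c = ((uint32_t)(p[0] & 0x0F) << 12) | ((uint32_t)(p[1] & 0x3F) << 6)
          | (uint32_t)(p[2] & 0x3F);
        break;
    default:
        c = ((uint32_t)(p[0] & 0x07) << 18) | ((uint32_t)(p[1] & 0x3F) << 12)
          | ((uint32_t)(p[2] & 0x3F) << 6) | (uint32_t)(p[3] & 0x3F);
        break;
    }

    *pp = p + len;
    return c;
}
```

With min = 0xfe000 and max = 0xfefff, only lead bytes of four-byte sequences reach the switch. One-, two- and three-byte characters return immediately, and with update_str_ptr set they are still skipped over, which is the behaviour a negative class needs.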
