http://bugs.exim.org/show_bug.cgi?id=1419

--- Comment #9 from Zoltan Herczeg <[email protected]> 2013-12-15 09:30:33 ---

I commented out the following lines in pcre_exec:

    if (utf)
      ACROSSCHAR(start_match < end_subject, *start_match, start_match++);

All regression tests pass (not surprisingly, since JIT has always worked this way), and the performance is now similar in the first two cases even in the interpreter (without the utf check). I also tried what happens if the input consists of two- or three-byte UTF-8 code points (where this optimization should do something useful). There is a slight performance degradation, but it is still much smaller than the cost of this optimization.

The second needs more time to do (not that much though, I think it is less than one day), and should not be in the next release because of the risk of breaking things.

Btw, how would you optimize this with SSE:

    uint8_t bitset[32]; // One bit for each character between 0..255
    char* start;

    // In 8 bit mode:
    while (bit_is_set(*start, bitset) == FALSE)
      start++;

In 16 and 32 bit modes, you need to check whether *start is greater than 255 as well.

I don't plan to take your bounty, so feel free to work on this. It is funny that they did not specify the amount of speedup they expect.
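For reference, the scalar loop from the question can be written out as a minimal, self-contained sketch. This is not the PCRE source: the helper names (bit_is_set, bit_set, scan8, scan16) and the explicit end pointer are assumptions added for illustration. In particular, the 16-bit variant assumes that any code unit above 255 stops the scan, because the 256-bit table cannot rule it out; the message only says that such values need a check, not how they are handled.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if bit c (0..255) is set in the table. */
static int bit_is_set(unsigned int c, const uint8_t bitset[32])
{
  return (bitset[c >> 3] >> (c & 7)) & 1;
}

/* Hypothetical helper: mark code unit c (0..255) as a possible start. */
static void bit_set(unsigned int c, uint8_t bitset[32])
{
  bitset[c >> 3] |= (uint8_t)(1u << (c & 7));
}

/* 8-bit mode: advance to the first code unit whose bit is set.
   Returns end if no such code unit exists (bound added for safety). */
static const uint8_t *scan8(const uint8_t *start, const uint8_t *end,
                            const uint8_t bitset[32])
{
  while (start < end && !bit_is_set(*start, bitset))
    start++;
  return start;
}

/* 16-bit mode: same scan, with the extra range check.
   ASSUMPTION: a code unit above 255 is treated as a possible start,
   since the table carries no information about it. */
static const uint16_t *scan16(const uint16_t *start, const uint16_t *end,
                              const uint8_t bitset[32])
{
  while (start < end && *start <= 255 && !bit_is_set(*start, bitset))
    start++;
  return start;
}
```

Any SSE version would have to return the same positions as scan8 for the same table, so a scalar routine like this can serve as the baseline to compare against, both for correctness and for timing.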
