http://bugs.exim.org/show_bug.cgi?id=1419

--- Comment #1 from Philip Hazel <[email protected]> 2013-12-14 16:48:31 ---

I am not surprised there is a difference, but it does seem excessive. I have just tried a quick experiment using the timing facilities in pcretest to match [w-z] and [\x{fe000}-\x{fefff}] against a 300K data string that did not contain any relevant characters (so the result was no match). I am on an ancient x86 box and compiler optimization was turned off. For [w-z] the time was 28ms and for the other case it was 35ms. I know this is crude, but it shows nothing like the difference you report. (Studying the pattern speeds things up, as of course does using JIT.)

For code points < 256, a class uses a bit map, whereas for your other pattern it will be comparing character values with the requested range.

The interesting question is: what is different between your environment and mine? (I used [w-z] rather than [b-z] because my file had no instances of w-z, but as that uses a bit map, it shouldn't make any difference.) Were you scanning one long string, or a number of strings from a file? Did you find a match or not? My data string contained entirely ASCII characters, which are of course quicker to load than multi-byte characters with code points > 255. (And there's always the possibility that I've not been running what I thought I was running...)
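
To illustrate the bit map versus range comparison distinction mentioned above, here is a minimal C sketch. It is not PCRE's actual code; the type and function names (class_bitmap, class_range, bitmap_add_range, bitmap_contains, range_contains) are hypothetical and exist only for this illustration. It assumes a 256-bit map for code points below 256 and a single low/high pair for anything above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 256-bit map: one bit per code point below 256. */
typedef struct {
    uint8_t bits[32];
} class_bitmap;

/* Hypothetical inclusive range, for classes such as [\x{fe000}-\x{fefff}]. */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} class_range;

/* Set the bits for every code point in lo..hi (both must be below 256). */
void bitmap_add_range(class_bitmap *map, unsigned lo, unsigned hi)
{
    assert(lo <= hi && hi < 256);
    for (unsigned c = lo; c <= hi; c++)
        map->bits[c / 8] |= (uint8_t)(1u << (c % 8));
}

/* Bit map test: one bounds check, one index, one mask. */
bool bitmap_contains(const class_bitmap *map, uint32_t c)
{
    return c < 256 && (map->bits[c / 8] & (1u << (c % 8))) != 0;
}

/* Range test: compare the character value against both ends of the range. */
bool range_contains(const class_range *r, uint32_t c)
{
    return c >= r->lo && c <= r->hi;
}
```

Both tests are a handful of instructions per character, which is consistent with the small timing gap measured above; in this sketch neither path does anything that would account for a very large difference on its own.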
