https://bugs.exim.org/show_bug.cgi?id=2430
--- Comment #3 from Andreas Bergmann <andreas.bergm...@cyren.com> ---
(In reply to Andreas Bergmann from comment #2)
> (In reply to Philip Hazel from comment #1)
> > You missed the leading [ in "aA][bB][cC]" but I assume that's just a typo in
> > your posting, as it is present in your example. Investigating your patterns
> > shows that this is an effect caused by an optimization that happens in one
> > case, but not the other. Actually, it's an optimization that turns into a
> > pessimization. For (?i)abc PCRE2 records that a match must start with "a"
> > and there must be a "c" later in the source. For [aA][bB][cC] it records
> > only that a match must start with "A" or "a". It seems that searching for
> > "c" (which may be a long way after each "a") is taking up lots of time.
> > (Note, however, that if you use JIT, the problem doesn't occur.) I will take
> > a look at this - it occurs to me that searching for a "last fixed character"
> > is a bit pointless unless there is something variable between it and the
> > first character. Also, the search should perhaps only search so far after
> > the initial character.
> >
> > If you turn off the optimizations with NO_START_OPTIMIZE the two patterns
> > behave much the same.
>
> Thank you for your feedback - and true, the missing "[" is a typo.
>
> The reasoning makes perfect sense and I'll give the NO_START_OPTIMIZE option
> a try.

FYI / for the records:

1048576 bytes 0.043428 sec (?i)abc
1048576 bytes 0.000130 sec (a|A)(b|B)(c|C)
1048576 bytes 0.000094 sec [aA][bB][cC]

PCRE2_NO_START_OPTIMIZE:

1048576 bytes 0.002174 sec (?i)abc
1048576 bytes 0.004067 sec (a|A)(b|B)(c|C)
1048576 bytes 0.002128 sec [aA][bB][cC]
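For anyone reading along, here is a rough sketch of why a "required last character" check can cost more than it saves. It is a hypothetical model, not PCRE2's code: the function names (`work_first_only`, `work_first_and_last`, `fill_worst_case`) are made up for illustration, and the forward scan for "c" is deliberately naive (restarted at every candidate "a") to exaggerate the effect described in comment #1. It only counts inspected bytes; it does not measure time.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model, NOT PCRE2 source: counts the bytes inspected while
   searching a subject for "abc" case-insensitively. */

static int lower_at(const char *s, size_t i)
{
    return tolower((unsigned char)s[i]);
}

/* Attempt the literal at pos; every byte looked at is added to *work. */
static int try_abc(const char *s, size_t len, size_t pos, unsigned long *work)
{
    static const char lit[] = "abc";
    for (size_t i = 0; i < 3; i++) {
        if (pos + i >= len) return 0;
        (*work)++;
        if (lower_at(s, pos + i) != lit[i]) return 0;
    }
    return 1;
}

/* Strategy A: only the first-character check gates each match attempt. */
unsigned long work_first_only(const char *s, size_t len, int *found)
{
    unsigned long work = 0;
    assert(s != NULL && found != NULL);
    *found = 0;
    for (size_t pos = 0; pos < len && !*found; pos++) {
        work++;                              /* first-character check */
        if (lower_at(s, pos) == 'a')
            *found = try_abc(s, len, pos, &work);
    }
    return work;
}

/* Strategy B: additionally require a later 'c' before each attempt,
   rescanning from every candidate (assumed naive on purpose). */
unsigned long work_first_and_last(const char *s, size_t len, int *found)
{
    unsigned long work = 0;
    assert(s != NULL && found != NULL);
    *found = 0;
    for (size_t pos = 0; pos < len && !*found; pos++) {
        work++;                              /* first-character check */
        if (lower_at(s, pos) != 'a') continue;
        size_t p = pos + 1;
        while (p < len) {                    /* look for the required 'c' */
            work++;
            if (lower_at(s, p) == 'c') break;
            p++;
        }
        if (p == len) break;                 /* no 'c' left: cannot match */
        *found = try_abc(s, len, pos, &work);
    }
    return work;
}

/* Build n_a copies of 'a' followed by "xc"; buf must hold n_a + 3 bytes.
   Returns the subject length (excluding the terminating NUL). */
size_t fill_worst_case(char *buf, size_t n_a)
{
    memset(buf, 'a', n_a);
    buf[n_a] = 'x';
    buf[n_a + 1] = 'c';
    buf[n_a + 2] = '\0';
    return n_a + 2;
}
```

On a subject built by `fill_worst_case`, strategy A inspects a few bytes per "a", while strategy B walks to the distant "c" from every "a", so its work grows roughly quadratically with the subject length. That matches the two ideas floated in comment #1: skip the last-character search when nothing variable sits between the first and last characters, or cap how far ahead the search looks.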