------- You are receiving this mail because: ------- You are on the CC list for the bug.
http://bugs.exim.org/show_bug.cgi?id=1347 --- Comment #4 from Philip Hazel <[email protected]> 2013-04-17 16:56:03 --- On Tue, 16 Apr 2013, [email protected] wrote: > > This is not a bug, though it is frequently mis-reported as one, so much > > so that I keep this standard response on file: > Unfortunately that's the default behaviour for most distributions > causing quite some hard to debug crashes as (see stacktrace of my > original report) there's not much relation between e.g. the executed PHP > code and the actual crash down in libpcre - thus I'm getting quite > regularly reports on crashes where there's not much debugging that can > be done. In this case I got quite lucky as there's a clear sample > causing the trouble where even the code to reproduce is quite minimal. As you probably realize, I work only on PCRE. I don't use PHP and know nothing about it. All those PHP functions in your backtrack don't mean anthing to me, I'm afraid. Nor do I know anything about GeSHi or Lua. In order to check for bugs, I have to have the pattern and subject string that is being matched. Sorry if that sounds harsh! > But even than I think here's the best way to workaround the issue. How > much work would it be to allow for libpcre to be compiled ina kinda > "hybrid mode" where stack-based matching is tried first and if reaching > the (stack) recursion limit a retry using the heap-based implementation > is performed with much higher limits? A client program could do that for itself. It could compile two separate PCRE libraries (for each data size - 8, 16, 32 - that it needs) and when a call to the stack-based one returned with a recursion limit exceeded, it could call the other library. [To avoid name clashes, there would have to be some fiddling, probably involving macros, when compiling the two libraries. Sorting our this detail would involve some work, I guess.] > Is there a way to get some "stats" for a regexp like max recursion > depth, taken branches, steps needed for solution, ...? The pcretest program has a facility for determining the max recursion depth, given a pattern and a subject string. Grep for \M in the pcretest man page. > > (a) Work on your regular expression pattern so that it uses less > > memory. Sometimes using atomic groups can help with this. > Will need some debugging aid for this. pcretest is your friend. > Some questions on that: > - - Do you know if PHP uses the pcrejit option? And if yes, since around > which version? Sorry, I know nothing about PHP. > - - Given something like ^\d+(?:(?:A|B|BC|BD|C|DB)+\d+)*$ What would be > the best way to ask libpcre to try various combinations of the letters > while avoiding backtracking once one such consecutive group has been found? Atomic groups and possessive quantifiers. For example, you probably should write \d++ instead of \d+ above. Given a run of digits, there is no point trying with all possible numbers of them, since what follows is never a digit. [PCRE can sometimes convert \d+ into \d++ for you automatically, but not in that example. It only does it when the very next item is obviously not a digit.] Again, if you write (?:A|B|BC|BD|C|DB)++ once it has found A it will carry on to look for a digit, but if there isn't one, it won't look for other letters. But that means you should re-order the alternatives, as otherwise it will never find BC (for example) because it will find B first. Regards, Philip -- Configure bugmail: http://bugs.exim.org/userprefs.cgi?tab=email -- ## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
