On Fri, 20 Jan 2012, Philip Hazel wrote:

> I have committed an even simpler implementation, but I am unhappy about
> the figures it's giving me on this Linux box[*], so I have made it a bit
> more obscure. You have to run "pcretest -m -C" to make it show the
> value. And the output now uses the word "approximate".

This version also seems fine. With my normal MSVC build it shows:

  PCRE version 8.30-Trunk-JAN20 2012-01-20
  Compiled with
    8-bit support only
    UTF-8 support
    No Unicode properties support
    No just-in-time compiler support
    Newline sequence is LF
    \R matches all Unicode newlines
    Internal link size = 2
    POSIX malloc threshold = 30
    Default match limit = 50000
    Default recursion depth limit = 2500
    Match recursion uses stack: approximate frame size = 340 bytes

When compiled with optimization disabled for debug (MSVC /Od) it shows:

    Match recursion uses stack: approximate frame size = 936 bytes

(As before, the POSIX threshold, match limit, and recursion depth are
non-standard values that I overrode in my config.h.)

Adding #define SUPPORT_UCP in config.h to a normal build yields:

  PCRE version 8.30-Trunk-JAN20 2012-01-20
  Compiled with
    8-bit support only
    UTF-8 support
    Unicode properties support
    No just-in-time compiler support
    Newline sequence is LF
    \R matches all Unicode newlines
    Internal link size = 2
    POSIX malloc threshold = 30
    Default match limit = 50000
    Default recursion depth limit = 2500
    Match recursion uses stack: approximate frame size = 360 bytes

Having SUPPORT_UCP and optimization disabled for debug yields:

    Match recursion uses stack: approximate frame size = 1184 bytes

> [*] Adding more than a certain number of printf statements increased the
> apparent frame size; being simple-minded, I don't really understand why.
> I guess the ways of gcc are mysterious.

I can only describe MSVC compilers. If you were 'simple-minded' then I'd be
the blooming idiot Manuel from Fawlty Towers. Hopefully the issues you've
encountered are caused by compiling in a debug mode.

Anyway, some PCRE users may encounter these issues and be without a clue
about what recursion is or how to code for a resolution. Your putting
something in place that can provide even an approximation of the real-world
stack usage might at least provide a clue.
It's a matter of using the pre-existing PCRE match_limit_recursion feature,
setting that to fit within the particular application's available stack
size. This new stack calculation may provide a reasonably accurate estimate
of where to set match_limit_recursion so that there is a meaningful PCRE
error code instead of dying with a stack fault.

Granted, taking a stack fault and then rewriting a RegEx until it doesn't
fault (or throwing more stack at the problem) is a lot easier when that's an
option. I think very few real-world complex expressions would ever trigger a
stack fault. Even then it might not be critical to many PCRE users; they may
choose not to add any code calculations to bullet-proof against it. But to
me and maybe a few others, what you're addressing here is very important.

If you get to a point where you feel that having the stack calculation
introduces complexity or otherwise becomes unmaintainable, then I'm OK with
going back to hacking something into new PCRE releases for my own uses.
PCRE's continued ability to find the right stuff quickly and reliably is
more important to everybody.

Thank you, Philip!!

Regards,
Graycode

-- 
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
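
As a rough illustration of the calculation described above, here is a minimal sketch in C. It assumes the "approximate frame size" reported by "pcretest -m -C" is close to what one level of match recursion really costs, which the word "approximate" warns may not hold. The function name recursion_limit_for_stack and the reserve_bytes safety margin are hypothetical and not part of PCRE.

```c
#include <stddef.h>

/* Hypothetical helper, not part of PCRE.
 *
 * stack_bytes   - total stack the application can count on (assumption:
 *                 the caller knows this, e.g. from its linker settings)
 * reserve_bytes - stack kept back for everything that is not match
 *                 recursion (an arbitrary safety margin)
 * frame_bytes   - the approximate frame size printed by "pcretest -m -C"
 *
 * Returns the largest recursion depth that fits in the remaining stack,
 * or 0 when no safe depth can be computed.
 */
unsigned long recursion_limit_for_stack(size_t stack_bytes,
                                        size_t reserve_bytes,
                                        size_t frame_bytes)
{
    if (frame_bytes == 0 || stack_bytes <= reserve_bytes)
        return 0;

    /* Integer division rounds down, which errs on the safe side. */
    return (unsigned long)((stack_bytes - reserve_bytes) / frame_bytes);
}
```

Assuming a 1 MB stack with 64 KB held in reserve, the 340-byte frame above would allow a depth of 2891, and the 1184-byte debug frame only 830. The result would then go into the match_limit_recursion field of a pcre_extra block, with PCRE_EXTRA_MATCH_LIMIT_RECURSION set in its flags, so that pcre_exec() can return PCRE_ERROR_RECURSIONLIMIT rather than overrunning the stack. A return value of 0 means the caller has to decide what to do; the sketch does not choose a fallback.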
