Hi, Is P1 means two literals? 'P' and '1'? PCRE has clever optimizations for quantifiers used on single characters: it can count them, and go back when recursion needed. The story is more difficult for other expessions, since a recursion may happened inside the subexpression.
Anyway, back references also optimized in the same way as characters, so perhaps: "(P1)*" could be rewritten to "(?:(P1)\1*)?" Regards, Zoltan Alan Lehotsky <[email protected]> írta: >I have a string that's 150,002 characters consisting of> > P1P2......P4....P9.....> > > (Thats 'P1', followed by 50k each of 'P2', 'P4', 'P9')> > and a simpleminded regex of> > (P1(P2)*(P4)*(P9)*)?> > With PCRE 7.9, this crashes in match() on the 5672'd recursive call (on a linux box with a moderately large stack)> > With the old Spencer regex, we match successfully.> > I tried the obvious things to improve this:> > 1/ replaced capturing parentheses with (?: )> > 2/ Added explicit ^$ anchors> > 3/ possessive quantifiers.> > 4/ atomic grouping> > In all cases I still stomp the stack via recursion in match().> > I assume that if I configured PCRE to not use the stack that I could make this run (although probably really slowly due to malloc/free overhead). > > Any advice?> -- > ## List details at http://lists.exim.org/mailman/listinfo/pcre-dev > -- ## List details at http://lists.exim.org/mailman/listinfo/pcre-dev
