------- You are receiving this mail because: ------- You are on the CC list for the bug.
http://bugs.exim.org/show_bug.cgi?id=1303 --- Comment #2 from Philip Hazel <[email protected]> 2012-10-07 18:05:39 --- On Thu, 4 Oct 2012, Jille Timmermans wrote: > When I match this regex, I would expect the second } to be included in 0. > > re> /{[A-Z]+(\|param:[^|}]*(?R)?[^|}]*)*}/ > data> A{X|param:{Y}}B > 0: {X|param:{Y} > 1: |param:{Y I am afraid you are wrong. I have now analysed this example. This is what happens: . It scans along the subject till it finds the first { . [A-Z]+ matches X . \|param: matches literally . [^|}]* matches {Y, leaving }}B . We then read (?R), which recurses . This fails because the next character is not { . However, since it is optional, we can carry on . [^|}]* matches zero characters . The final } matches the first } in }}B > If I make de (?R) mandatory, it does work as expected: That's because when it fails the first time, there is a backup into [^|}]* which causes it to try again and again, until the recursion matches. In cases like this, it is instructive to use the /C option in pcretest to see how it is matching: /{[A-Z]+(\|param:[^|}]*(?R)?[^|}]*)*}/C A{X|param:{Y}}B --->A{X|param:{Y}}B +0 ^ { +1 ^^ [A-Z]+ +7 ^ ^ (\|param:[^|}]*(?R)?[^|}]*)* +8 ^ ^ \| +10 ^ ^ p +11 ^ ^ a +12 ^ ^ r +13 ^ ^ a +14 ^ ^ m +15 ^ ^ : +16 ^ ^ [^|}]* +22 ^ ^ (?R)? +0 ^ ^ { +27 ^ ^ [^|}]* +33 ^ ^ ) +8 ^ ^ \| +35 ^ ^ } +36 ^ ^ This clearly shows that the first [^|{]* is matching {Y. If you do the same without the ? you will see it backing up. Philip -- Configure bugmail: http://bugs.exim.org/userprefs.cgi?tab=email -- ## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
