http://bugs.exim.org/show_bug.cgi?id=1190

--- Comment #3 from Philip Hazel <[email protected]> 2011-12-28 11:08:37 ---

On Tue, 27 Dec 2011, Alan Lehotsky wrote:

> Trying to match the pattern (from page 66 of Mastering Regular Expressions,
> 3rd Edition)
>
> (?<=\d)(?=(\d\d\d)+(?!\d)
>
> with the source string "1234567"
>
> This fails if my search loop uses the startoffset argument to pcre_exec() to
> advance thru the search string (leaving the subject and length unchanged as I
> find successive match points).
>
> But it does work if I advance the subject ptr and decrement the length, and
> use a zero for the startoffset on each call.

The pcretest program has facilities for trying both of these methods, and
for me it gives the same result both times:

PCRE version 8.12 2011-01-15

/(?<=\d)(?=(\d\d\d)+(?!\d))/g+
    1234567
 0:
 0+ 234567
 1: 567
 0:
 0+ 567
 1: 567

/(?<=\d)(?=(\d\d\d)+(?!\d))/G+
    1234567
 0:
 0+ 234567
 1: 567
 0:
 0+ 567
 1: 567

The /g option does the startoffset thing, whereas the /G option advances
the pointer after a match. The /+ option causes it to output the rest of
the string that follows a match - so you can see exactly where it matches
an empty string. Note also that Perl with /g also gives exactly the same
results.

If you are worried that the /G option is looking behind in order to give
the first match (as I momentarily was), you are mistaken. That match
happens when the string is passed as "1234567" - remember that when an
unanchored pattern is matched there is an internal advance within the
string. A match against "234567" finds only the second match:

/(?<=\d)(?=(\d\d\d)+(?!\d))/+
    234567
 0:
 0+ 567
 1: 567

--
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev
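The two search strategies compared in this comment can be sketched outside PCRE. What follows is only a rough illustration using Python's re module, not pcre_exec() or pcretest, so it approximates the behaviour rather than reproducing it. The function names matches_by_offset and matches_by_slicing are made up for this example, and stepping one character past an empty match is a simplification of the retry logic that pcretest actually uses.

```python
import re

# The pattern as it appears in the pcretest run (with the final closing parenthesis).
PATTERN = re.compile(r"(?<=\d)(?=(\d\d\d)+(?!\d))")


def matches_by_offset(subject):
    """Keep the subject fixed and advance a start offset (roughly pcretest /g)."""
    found, pos = [], 0
    while pos <= len(subject):
        m = PATTERN.search(subject, pos)
        if m is None:
            break
        found.append(m.start())
        # Step one character past an empty match so the loop makes progress.
        pos = m.end() + 1 if m.end() == m.start() else m.end()
    return found


def matches_by_slicing(subject):
    """Advance the subject itself and always search from offset 0 (roughly pcretest /G)."""
    found, base = [], 0
    while base <= len(subject):
        m = PATTERN.search(subject[base:])
        if m is None:
            break
        # Report the position relative to the original subject.
        found.append(base + m.start())
        base += m.end() + 1 if m.end() == m.start() else m.end()
    return found


if __name__ == "__main__":
    print(matches_by_offset("1234567"))      # [1, 4]
    print(matches_by_slicing("1234567"))     # [1, 4]
    # Starting from "234567" there is no digit before the "2",
    # so only the match in front of "567" is found.
    print(PATTERN.search("234567").start())  # 3
```

Offsets 1 and 4 correspond to the "0+ 234567" and "0+ 567" lines in the pcretest output: both loops find the same two empty matches, while a search that is handed "234567" from the outset finds only the one in front of "567".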
