Philip Hazel wrote:
> On Sat, 22 Aug 2009, Sheri wrote:
>
>> I didn't have it either. Just installed v5.10.0
>>
>> The following prints abc-def-ghi
>>
>> #!/usr/bin/perl
>>
>> $d = "abcdefghi" ;
>> $d =~ s/abc\K|def\K/-/g ;
>>
>> print $d, "\n";
>
> I have just committed a patch that deals with this issue. It is a
> typical case of feature addition having unintended consequences. The
> problem, in PCRE, is not a bug in the use of \K; it is concerned with
> matching empty strings. Perl has special rules for what happens with /g
> after an empty string match. It can't behave as normal, of course,
> because it would just loop for ever. PCRE itself has no /g feature, but
> pcretest tries to emulate Perl. What it did was to re-run the pattern at
> the matching point, with PCRE_ANCHORED and PCRE_NOTEMPTY set. This made
> it look for a non-empty match at the point where an empty match had
> previously succeeded. If this failed, it moved on by one character.
>
> Now, until the advent of \K, the only point in a subject string at which
> an anchored pattern could match an empty string was at the start, so
> this worked fine, and did exactly what Perl did. The arrival of \K
> changes this: your pattern matches an empty string not at the start, and
> this is still true even if it is anchored. Such a match should *not* be
> ignored when processing /g, but PCRE_NOTEMPTY ignores all empty matches.
>
> In order to make this work, I have had to invent a new PCRE option
> called PCRE_NOTEMPTY_ATSTART, which locks out an empty match only at the
> start of the subject string. An empty match further into the string is
> accepted. If the pattern is anchored, the two options differ in their
> behaviour only if \K is used. (An alternative would have been to modify
> the behaviour of PCRE_NOTEMPTY, but that would have been an incompatible
> change, and these always cause grief to someone.)
>
> Now pcretest uses the new option, and behaves like Perl.
> This issue has pushed me into installing Perl 5.10 for testing
> purposes (in parallel with Perl 5.8 so I can try both) which is
> probably a Good Thing. I'm going to re-arrange some of the tests so
> that a proper check of all of them against 5.10 can be done.
>
> Philip

Interesting, thanks for the detailed explanation. Seems odd, however, that a lookbehind version works in 7.9?

  re> /(?<=abc)|(?<=def)/g
data> abcdefghi
 0:
 0:
data>

I understand why you are making another option, but it sounds like, as a result, all user apps that do multiple matching (and the C++ module) will need to be modified to benefit. In fact, if using a shared library, the empty-match case will need to be handled one way with version 8.0 or later and another way with 7.9 and earlier.

Have you considered giving the new option value to the old functionality?

Regards,
Sheri

--
## List details at http://lists.exim.org/mailman/listinfo/pcre-dev
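
For anyone following the thread, here is a rough sketch of the /g retry logic Philip describes, written as a toy model in Python. It does not call PCRE at all: match_at() is a hand-written stand-in for the pattern abc\K|def\K, and execute(), global_matches() and substitute() are hypothetical names invented for this illustration. The notempty and notempty_atstart flags only imitate the behaviour described above for PCRE_NOTEMPTY and PCRE_NOTEMPTY_ATSTART.

```python
# Toy model only. Nothing here is a PCRE or pcretest API; all names are
# made up for illustration.

def match_at(subject, pos):
    r"""Model one attempt of abc\K|def\K starting exactly at pos.

    \K resets the reported start of the match, so a success is reported
    as an empty match (end, end) sitting just after the literal.
    """
    for literal in ("abc", "def"):
        if subject.startswith(literal, pos):
            end = pos + len(literal)
            return (end, end)
    return None


def execute(subject, start, anchored=False,
            notempty=False, notempty_atstart=False):
    """Return the first acceptable (start, end) match, or None."""
    last = start if anchored else len(subject)
    for pos in range(start, last + 1):
        m = match_at(subject, pos)
        if m is None:
            continue
        is_empty = m[0] == m[1]
        if notempty and is_empty:
            continue  # imitates PCRE_NOTEMPTY: every empty match is refused
        if notempty_atstart and is_empty and m[0] == start:
            continue  # imitates PCRE_NOTEMPTY_ATSTART: refused only at start
        return m
    return None


def global_matches(subject, use_atstart):
    """Emulate /g: after an empty match, retry anchored at the same point."""
    matches = []
    offset = 0
    retry = False  # True immediately after an empty match
    while offset <= len(subject):
        if retry:
            m = execute(subject, offset, anchored=True,
                        notempty=not use_atstart,
                        notempty_atstart=use_atstart)
        else:
            m = execute(subject, offset)
        if m is None:
            if not retry:
                break      # an ordinary search failed: no more matches
            offset += 1    # the retry failed: move on by one character
            retry = False
            continue
        matches.append(m)
        offset = m[1]
        retry = m[0] == m[1]
    return matches


def substitute(subject, replacement, use_atstart):
    """Apply the replacement at every match found by global_matches()."""
    out, prev = [], 0
    for start, end in global_matches(subject, use_atstart):
        out.append(subject[prev:start])
        out.append(replacement)
        prev = end
    out.append(subject[prev:])
    return "".join(out)


print(substitute("abcdefghi", "-", use_atstart=False))  # old retry strategy
print(substitute("abcdefghi", "-", use_atstart=True))   # new retry strategy
```

In this model the old strategy finds only the empty match at offset 3 and prints abc-defghi, because the anchored retry at offset 3 produces another empty match (at offset 6) and the blanket non-empty rule throws it away. With the "at start" rule that second match is kept, and the result is abc-def-ghi, the same as the Perl 5.10 output quoted at the top.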
