On Tue & Thu, 15 & 17 Sep 2009, I wrote: > If you write the alternatives the other way round, it works as you > expect: > > /^((.)(?1)\2|.?)$/
Er, actually, it doesn't work if there are an even number of characters in the subject. > I think I will use this example to make the point more clearly in the > documentation. ... and while I did so, I figured out exactly what is going on. This is a difference between the way PCRE (and Python) handle recursive patterns, and Perl. In PCRE, they are always treated as atomic groups. This means that when a lower recursion has matched something, it cannot be re-entered to try another alternative if there is a failure higher up. That's why the pattern above doesn't work: it matches a character at the lowest level, but cannot come back to match an empty string when the subject has an odd number of characters. > I don't know how it works, but it recognizes palindromes > such as "A man, a plan, a canal, Panama!". > > /^\W*(?:((.)\W*(?1)\W*\2|)|((.)\W*(?3)\W*\4|\W*.\W*))\W*$/i I do now know how it works. It is essentially the same as the pattern above, but with the even and odd cases written out separately, so that the alternation happens at the outer level. > Hm, I see that *this* pattern also makes Perl 5.10 loop. Well, strictly, not "loop", but "take an inordinate amount of time". It works fine if you use possessive quantifiers for the "non-word" characters: /^\W+(?:((.)\W+(?1)\W+\2|)|((.)\W+(?3)\W+\4|\W+.\W+))\W+$/i I did some measurements in pcretest, and the possessive quantifier speeds some cases up by a factor of 10 or more. Philip -- Philip Hazel -- ## List details at http://lists.exim.org/mailman/listinfo/pcre-dev
