I did more investigation:

Perl:
/(?:(?:(a)b)?\1)+/ matches abaa
/(?:(?:(ab))?\1)+/ does not match ababab

These pattern / input pairs match in PCRE2. I am pretty sure (?:(P))? is 
rewritten to ((?:P)?) in Perl, which is valid in some cases, but not in all 
of them. ND, I think you have found a pretty nice Perl bug; maybe you could 
report it to them.

Regards,
Zoltan

-------- Original message --------
From: Zoltán Herczeg < hzmes...@freemail.hu >
Date: June 6, 2021 07:21:30
Subject: Re: [pcre-dev] Capture not reset inside recursion
To: Pcre-dev@exim.org < nad...@mail.ru >
The title is misleading; resetting captures on each repetition is a JavaScript 
feature:
/(?:(a)b|\1)+/ matches aba in Perl, but not in JavaScript.
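
The JavaScript half of that comparison can be checked in any ECMAScript engine. The following is only a minimal sketch: the variable names and the anchored variant of the pattern are illustrative assumptions, added to make "matches the whole input" explicit. It does not reproduce the Perl or PCRE2 behaviour.

```javascript
// Pattern from the message above, unanchored and anchored.
// The anchored form is an assumption added to test a full match of "aba".
const pattern = /(?:(a)b|\1)+/;
const anchored = /^(?:(a)b|\1)+$/;

const m = pattern.exec("aba");
const matchedText = m[0]; // text consumed by the whole pattern
const group1 = m[1];      // value of capturing bracket 1 after the match

// In JavaScript, capture 1 is cleared at the start of every iteration of
// the + group, so \1 in the second iteration refers to an unset group.
// A backreference to an unset group matches the empty string, and an
// empty iteration is rejected, so the trailing "a" is never consumed.
const matchesWholeInput = anchored.test("aba");

console.log(matchedText, group1, matchesWholeInput);
```

With these semantics the unanchored match stops after ab, and the anchored pattern fails on aba.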
Anyway, it looks like the problem here is that ()? clears the capturing bracket 
in Perl when the empty case is selected, while PCRE2 restores its previous value.
Matching /(?:(a)??b)+/ against abb shows the same difference: the capturing 
bracket is empty in Perl, but set to a in PCRE2.
Even more interesting, /(?:(?:(a))??\1)+/ also matches only aa, although the 
body of the ?? should not be matched in the second iteration.
Let's do some debugging:
Match /(?:(?{ print "<$1>" })(?:(a))??(?{ print "[$1]" })\1)+/ to aaa
Output:
<>[][a]<a>[][a]
In the second iteration, the capturing bracket contains a before the ?? is 
executed, and is reset to nothing afterwards.
You will not believe this, but /(?:(?:(?{ print "!" })(a))?\1)+/ matches aaa, 
just as in PCRE2. The code block should have no effect on the matching, yet 
it disables something (probably an optimization) and the pattern works as expected.
Is this a Perl bug?
Regards,
Zoltan
 
-------- Original message --------
From: ND via Pcre-dev < pcre-dev@exim.org >
Date: June 6, 2021 00:44:08
Subject: [pcre-dev] Capture not reset inside recursion
To: Pcre-dev@exim.org
Here is pcretest listing:
PCRE2 version 10.35 2020-05-09
/(?:(a)?\1)+/
aaa
0: aaa
Expected result:
0: aa
Perl result:
0: aa
--
## List details at https://lists.exim.org/mailman/listinfo/pcre-dev