On Tue, 22 Apr 2014, Jean-Christophe Deschamps wrote:

> How do we differentiate between an unused capturing group and a pseudo-match
> resulting from a DEFINE?
> 
> For instance and in Perl format, the following patterns give me the same
> result on input 'bbb' even when the DEFINE is not actually used:
> /(a)? (b+)/x
> /(?(DEFINE) (?<head> xyz)) (b+)/x
> 
> In both cases I get:
> ''     (an empty capture)
> 'bbb'

Using pcretest:

PCRE version 8.36-RC1 2014-04-21

/(a)? (b+)/x
bbb
 0: bbb
 1: <unset>
 2: bbb

/(?(DEFINE) (?<head> xyz)) (b+)/x
bbb
 0: bbb
 1: <unset>
 2: bbb

That is, both give the same result. How does pcretest know that group 1
is unset? Because the start and end offsets for that group in the
ovector are both set to -1.
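
The same (-1, -1) convention can be seen outside PCRE. The sketch below 
uses Python's re module purely as a stand-in (an assumption of this 
example; it is not PCRE and has no DEFINE support, so only the first 
pattern is shown). It also contrasts an unset group with a group that 
matched the empty string, which is the distinction a wrapper loses if it 
reports both as ''.

```python
import re

# Python's re is only a stand-in here; it is not PCRE.
unset = re.compile(r"(a)? (b+)", re.X).match("bbb")   # group 1 never participates
empty = re.compile(r"(a*) (b+)", re.X).match("bbb")   # group 1 matches ""

# An unset group has offsets (-1, -1) and no value at all.
print(unset.span(1), unset.group(1))

# A group that matched the empty string has real offsets and the value "".
print(empty.span(1), repr(empty.group(1)))
```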

> So I'd like to point out how to prevent the bug in this particular
> implementation, in order to simplify the developers' job. I suspect it has
> to do with how ovector entries are interpreted, but from the PCRE docs it
> seems that both an empty group and a DEFINE group return (-1, -1), provided
> I read the docs correctly.

Yes, that's right. So I guess the answer to your original question is 
that there is no way to tell the difference. The DEFINE group is a 
numbered group, but will always be unset. The same result occurs if you 
use {0} to specify a zero repetition for a group, for example, (a){0} 
instead of (a)? in your first example.
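
As a rough illustration of how a wrapper might interpret the ovector 
without turning an unset group into an empty string, here is a minimal 
sketch. The helper name decode_ovector and its parameters are 
hypothetical and not part of PCRE; the offsets are the ones implied by 
the pcretest output above for the subject "bbb".

```python
from typing import List, Optional

def decode_ovector(subject: str, ovector: List[int], count: int) -> List[Optional[str]]:
    """Turn PCRE-style (start, end) offset pairs into captures.

    Hypothetical helper: a pair of (-1, -1) is reported as None (unset),
    never as an empty string.
    """
    captures: List[Optional[str]] = []
    for n in range(count):
        start, end = ovector[2 * n], ovector[2 * n + 1]
        captures.append(None if start < 0 else subject[start:end])
    return captures

# Three pairs: whole match, group 1 (unset), group 2.
captures = decode_ovector("bbb", [0, 3, -1, -1, 0, 3], 3)
print(captures)
```

With this mapping, group 1 comes back as None for both patterns, so the 
caller still cannot tell a DEFINE group from an unused optional group, 
but it can tell either of them from a group that matched ''.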

Philip

-- 
Philip Hazel

