In <[EMAIL PROTECTED]>, Richard Proctor writes
:
:TomCs perl storm has:
:
:> Figure out way to do
:>
:> /$e1 $e2/
:>
:> safely, where $e1 might have '(foo) \1' in it.
:> and $e2 might have '(bar) \1' in it. Those won't work.
:
:If e1 and e2 are qr// type things the answer might be to localise
:the backref numbers in each qr// expression.
:
:If they are not qr//s it might still be possible to achieve if the expansion
:of variables in regexes is done by the regex compiler it could recognise
:this context and localise the backrefs.
:
:Any code like this is going to have real problem with $1 etc if used later,
:use of assignment in a regex and named backrefs (RFC 112) would make this
:a lot safer.

I think it is reasonable to ask whether the current handling of qr{}
subpatterns is correct:

perl -wle '$a=qr/(a)\1/; $b=qr/(b).*\1/; /$a($b)/g and print join ":", $1, pos for "aabbac"'
a:5
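
To see where that result comes from, it helps to look at how the groups are numbered once both qr{} objects are interpolated. A minimal sketch of the same match as a script; the names $first, $second and show_groups are mine, chosen only for illustration:

```perl
use strict;
use warnings;

# The same two subpatterns as the one-liner, under illustrative names.
my $first  = qr/(a)\1/;
my $second = qr/(b).*\1/;

# Return the captures and the final pos() for one match of /$first($second)/g.
sub show_groups {
    my ($str) = @_;
    if ($str =~ /$first($second)/g) {
        # Groups are numbered across the whole combined pattern:
        #   $1 = (a) from $first
        #   $2 = the embedder's own parentheses around $second
        #   $3 = (b) from $second
        # so the \1 written inside $second ends up meaning 'a'.
        my @got = ($1, $2, $3, pos($str));
        return @got;
    }
    return;
}

print join(":", show_groups("aabbac")), "\n";   # prints a:bba:b:5
```

The \1 in the second subpattern was presumably written to mean "the b I just captured", but after interpolation it refers to the first group of the combined pattern.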
I'm tempted to suggest it isn't; that the paren count should be local
to each qr{}, so that the above prints 'bb:4'. I think that most people
currently construct their qr{} patterns as if they are going to be
handled in isolation, without regard to the context in which they are
embedded - why else do they override the embedder's flags if not to
achieve that?
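
The flag behaviour referred to here is easy to check. In the sketch below (all names are illustrative), an /i on the embedding pattern does not leak into the qr{}, and an /i on the qr{} does not leak out of it:

```perl
use strict;
use warnings;

my $word   = qr/foo/;     # compiled without /i
my $word_i = qr/foo/i;    # compiled with /i

# Each qr{} keeps its own flags, whatever the embedding pattern says.

# Outer /i does not make the embedded 'foo' case-insensitive.
sub outer_i_ignored { return "FOO bar" =~ /$word bar/i  ? 1 : 0 }

# Outer /i still applies to the embedder's own text ' bar'.
sub outer_i_applies { return "foo BAR" =~ /$word bar/i  ? 1 : 0 }

# Inner /i is kept even though the embedder has no /i.
sub inner_i_kept    { return "FOO bar" =~ /$word_i bar/ ? 1 : 0 }

# Inner /i stays inside the qr{}; ' bar' remains case-sensitive.
sub inner_i_local   { return "FOO BAR" =~ /$word_i bar/ ? 1 : 0 }

print join(" ", outer_i_ignored(), outer_i_applies(),
                inner_i_kept(), inner_i_local()), "\n";   # prints 0 1 1 0
```

So flags are already scoped to the qr{} that declared them, while group numbers are not.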

The problem then becomes: do we provide a mechanism to access the
nested backreferences outside of the qr{} in which they were referenced,
and if so what syntax do we offer to achieve that? I don't have an answer
to the latter, which tempts me to answer 'no' to the former for all the
wrong reasons. I suspect (and suggest) that this complication is the only
reason we don't currently have the behaviour that I think the rest of the
semantics warrant: that backreferences are localised within a qr{}.
I lie: the other reason qr{} currently doesn't behave like that is that
when we interpolate a compiled regexp into a context that requires it be
recompiled, we currently ignore the compiled form and act only on the
original string. Perhaps this is also an insufficiently intelligent thing
to do.
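
The effect of working from the original string can be observed from outside. A qr{} stringifies to its pattern text (wrapped in a flag group whose exact spelling varies between perl versions), and that text is what ends up inside the larger pattern. A small sketch, again with illustrative names:

```perl
use strict;
use warnings;

my $inner = qr/(b).*\1/;
my $outer = qr/(a)\1($inner)/;

# The pattern text of $inner appears verbatim inside $outer, so its \1
# is reinterpreted against $outer's group numbering.
sub inner_text_embedded { return index("$outer", "$inner") >= 0 ? 1 : 0 }

# Used alone, \1 in $inner means 'b', so "bab" matches.
sub alone_matches       { return "bab"   =~ /^$inner$/ ? 1 : 0 }

# Embedded after "aa", the same text no longer matches "bab" ...
sub embedded_same_text  { return "aabab" =~ /^$outer$/ ? 1 : 0 }

# ... because \1 now means 'a', so "bba" is what it matches instead.
sub embedded_renumbered { return "aabba" =~ /^$outer$/ ? 1 : 0 }

print join(" ", inner_text_embedded(), alone_matches(),
                embedded_same_text(), embedded_renumbered()), "\n";   # prints 1 1 0 1
```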
Hugo