In RFC 72, Peter Heslin gives this example:
:Imagine a very long input string containing data such as this:
:
:    ... GCAAGAATTGAACTGTAG ...
:
:If you want to match text that matches /GA+C/, but not when it
:follows /G+A+T+/, you cannot at present do so easily.

I haven't tried to work it out exactly, but I think you can
achieve this (and fairly efficiently) with something like:
  /
    (?: ^ |                      # else we won't match at start
      (?: (?> G+ A+ T+) | (.) )*
      (?(1) | . )
    )
    G A+ C
  /x
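
Since the pattern above is untested, a slow but obvious reference
is handy for comparing results on sample strings. The sketch below
is not the single-regex approach: it finds each /GA+C/ match and
then separately checks whether the text before it ends in
/G+A+T+/. The sub name unpreceded_matches is made up for
illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the start offsets of every /GA+C/ match
# that is not immediately preceded by text matching /G+A+T+/.
sub unpreceded_matches {
    my ($s) = @_;
    my @offsets;
    while ($s =~ /GA+C/g) {
        my $start  = $-[0];                  # offset where this match begins
        my $prefix = substr($s, 0, $start);  # everything before the match
        push @offsets, $start unless $prefix =~ /G+A+T+\z/;
    }
    return @offsets;
}

# The only GAAC in the RFC's sample follows GAATT, so nothing is reported.
print join(",", unpreceded_matches("GCAAGAATTGAACTGTAG")), "\n";  # prints an empty line
# Neither match here is preceded by G+A+T+.
print join(",", unpreceded_matches("GACTTGAAC")), "\n";           # prints 0,5
```

Copying the prefix for every match makes this quadratic in the
worst case, so it is only meant as a cross-check, not as a
replacement for a single efficient pattern.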

This requires the regexp engine to reliably leave $1 unset when
the most recent pass through the (...)* took the G+A+T+ branch.
That has been an area of many bugs and no little discussion in
perl5; I'm not sure of its status in the latest perls.
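
A quick way to see what a given perl does here is a small probe
along these lines. It is only a sketch: the sub name
capture_survives is invented for illustration, and the script
reports what it observes rather than assuming an answer.

```perl
use strict;
use warnings;

# Probe: on "ab" the first pass through (...)* takes the capturing
# branch and the second pass takes the non-capturing branch.
# Returns 1 if $1 is still set afterwards, 0 if it was reset,
# and undef if the string did not match at all.
sub capture_survives {
    my ($s) = @_;
    return undef unless $s =~ /\A (?: (a) | b )* \z/x;
    return defined $1 ? 1 : 0;
}

my $r = capture_survives("ab");
print "perl $]: ", ($r ? "\$1 stays set" : "\$1 is reset"), "\n";
```

If $1 stays set, the (?(1) | . ) test in the pattern above cannot
tell which branch was taken last.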

It isn't particularly relevant to this proposal since there are
other combinations that can't be resolved in this way; I thought
it might be of interest nonetheless.

Hugo
