I've been lurking a few days now, and RFC 72 piqued my interest.  I see the
motivation for a backwards-moving regexp engine, but am uncomfortable with
the details.

First worry is the syntax proposed.  I cringe when I see the regexp being
expressed such that "(?r)EDCB" matches "BCDE".  That and the jumping between
the left-end of the match and the right-end of the match make for a
near-unreadable regexp.

> As a frivolous illustration, the string
> ABCDEFGHIJKLM
> would be matched by:
>     m/FG(?r)EDCB(?f)HIJK(?r)A^(?f)LM$/

Can this be repackaged in such a way that it is a more natural extension of
the existing regexp language?

The RFC notes that the look-behind construct (?<= pattern) can almost be
used.  Two issues:  1. as currently implemented, the pattern must be of
fixed length.  2. this is a zero-width assertion.

My guess is that the fixed-length limitation exists because it allowed a
relatively quick hack.  A fixed-length pattern lets you step back in the
matched-against string by that many characters and match the pattern forwards.
If the regexp engine could "go backwards", then the fixed-length restriction
could be lifted.
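
A rough sketch of that "step back and match forwards" trick, written as
ordinary Perl rather than as engine internals.  The helper name
fixed_lookbehind and its arguments are made up for illustration; I am not
claiming this is how the engine actually implements it.

```perl
use strict;
use warnings;

# Hypothetical helper: emulate (?<= PATTERN) for a pattern known to match
# exactly $len characters, by stepping back $len characters from $pos
# and matching forwards from there.
sub fixed_lookbehind {
    my ($str, $pos, $pat, $len) = @_;
    return 0 if $pos < $len;                    # not enough text behind us
    my $behind = substr($str, $pos - $len, $len);
    return $behind =~ /\A(?:$pat)\z/ ? 1 : 0;   # ordinary forward match
}

my $data = 'GCAAGAATTGAACTGTAG';
my $pos  = index($data, 'GAAC');                # where GAAC starts
print fixed_lookbehind($data, $pos, 'GAATT', 5) ? "preceded\n" : "not preceded\n";
```

The sketch only works because $len is known up front; with a
variable-length pattern there is no single place to step back to, which is
where a backwards-moving engine would help.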

The zero-width assertion might be an issue.  The RFC's example doesn't
really get into this.

> Imagine a very long input string containing data such as this:
>     ... GCAAGAATTGAACTGTAG ...
> If you want to match text that includes the string GAAC, but only when it
> follows GAATT or any one of a large number of other different
> possibilities,

If it is important to be able to do both:

  $large = join '|', @possible;
  $data =~ / (?<= $large) GAAC /x;   # Don't care which @possible?

and

  $data =~ m/ ($large) GAAC /x;   # Need $1 to say which @possible

Then perhaps a back-reference-setting look-behind could be implemented?
Don't have an obvious syntax to use (back-tick == back-reference?), but
something like:

  $data =~ m/ (?`<= $large) GAAC /x;   # Need $1 to say which @possible
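
For comparison, a small sketch of how the two forms that already exist
behave on the RFC's sample data.  The contents of @possible are made up, and
I have kept every alternative the same length so that the look-behind
satisfies the fixed-length restriction.

```perl
use strict;
use warnings;

my @possible = ('GAATT', 'CCATT', 'TTAGG');   # made-up, all 5 characters long
my $large    = join '|', @possible;
my $data     = 'GCAAGAATTGAACTGTAG';

# Zero-width look-behind: only GAAC itself is part of the match.
if ($data =~ / (?<= $large) GAAC /x) {
    print "look-behind matched '$&'\n";
}

# Capturing group: $1 names the prefix, but the prefix is consumed too.
if ($data =~ m/ ($large) GAAC /x) {
    print "capture matched '$&', prefix was '$1'\n";
}
```

The first form leaves the prefix out of the match but does not say which
alternative was found; the second says which one, at the cost of making the
prefix part of the matched text.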


Does this enhanced look-behind satisfy the RFC's needs?

  = mike "looking for a sig" mulligan
