On Sat, Sep 02, 2000 at 01:52:09PM +0200, Bart Lateur wrote:
> On 1 Sep 2000 20:50:20 -0000, Perl6 RFC Librarian wrote:
>
> >Imagine a very long input string containing data such as this:
> >
> > ... GCAAGAATTGAACTGTAG ...
> >
> >If you want to match text that matches /GA+C/, but not when it
> >follows /G+A+T+/, you cannot at present do so easily. Under this
> >proposal, you might be able to do it like this: /GA+C(?!(?r)T+A+G+)/, or
> >under the alternative syntax: /GA+C(?`!G+A+T+)/.
>
> I think the location of the lookbehind stuff inside the regex is weird.
> This looks more natural to me:
>
> /(?`!G+A+T+)GA+C/

This is similar to the point Mark Dominus raised regarding the earlier
RFC: that, strictly speaking, you only need one lookbehind per regex. I
thought that it was more general to allow the lookbehind anywhere inside
the regex, and to be able to use it multiple times. Your regex would
work, too, under this proposal, but the ability to specify at what
point(s) exactly in the matching process the lookbehind should occur
seemed to me a desirable feature that might come at no extra cost.
Your version is closer to the way lookbehind works now, so this syntax
might be thought to be clearer; I should add to the RFC an explicit
note about this. Perhaps the same functionality I want could be
achieved with your syntax. Would /(?`!G+A+T+)(?:GA+)C/ mean "Match GA+,
then do the lookbehind, then match C"? If so, I'd be happy with that.
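
To make the variable-length case concrete: lookbehind as it exists now
only accepts fixed-width patterns, so /G+A+T+/ cannot be used there
directly. The sketch below is not part of the RFC. It only illustrates
the reversal idea that (?r) seems to suggest, by reversing the whole
string and turning the lookbehind into an ordinary negative lookahead.
It is written in Python (whose re module has the same fixed-width
restriction) purely for illustration, and the function name
find_not_preceded is made up for this example.

```python
import re

def find_not_preceded(text):
    """Return (start, end) spans of GA+C that are NOT immediately
    preceded by G+A+T+ (hypothetical helper, for illustration only)."""
    rev = text[::-1]          # reverse the input string
    n = len(text)
    spans = []
    # In the reversed string, "GA+C preceded by G+A+T+" becomes
    # "CA+G followed by T+A+G+", which a plain lookahead can test.
    for m in re.finditer(r"CA+G(?!T+A+G+)", rev):
        # Map the span in the reversed string back to the original.
        spans.append((n - m.end(), n - m.start()))
    return sorted(spans)

print(find_not_preceded("GCAAGAATTGAACTGTAG"))
print(find_not_preceded("CCGAACTT"))
```

On the sample string from the RFC the only GAAC is preceded by GAATT,
so nothing is reported; in the second string the GAAC follows CC and
is reported as the span (2, 6).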
Peter