On Sat, 2 Sep 2000 15:16:20 -0400, Peter Heslin wrote:

>> This looks more natural to me:
>> 
>>      /(?`!G+A+T+)GA+C/

>Your version is closer to the way lookbehind works now, so this syntax
>might be thought to be clearer; I should add to the RFC an explicit
>note about this.

Look at your original requirement:

>>If you want to match text that matches /GA+C/, but not when it
>>follows /G+A+T+/,

I find that "my" syntax most closely matches this requirement.

I find it clearer because it states WHAT it should match, not HOW it
should match.

>Perhaps the same functionality I want could be
>achieved with your syntax.  Would /(?`!G+A+T+)(?:GA+)C/ mean "Match GA+,
>then do the lookbehind, then match C"?  If so, I'd be happy with that.

I have the feeling that you're looking too closely at the implementation
details. A more complicated lookbehind like this one should
*automatically* be postponed until a point where it makes sense. At the
very least, the engine should find a "G", or even "GA", before it even
checks whether that is preceded by a "T", let alone by something that
matches /G+A+T+/.

But, that is not your problem. That is a problem of regex optimization.
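
To make the "postpone the lookbehind" idea concrete, here is a minimal
sketch in Python rather than Perl, since the proposed (?`!...) syntax
does not exist in any engine. It emulates the behaviour by hand: find a
/GA+C/ candidate first, and only then check whether the text before it
ends in something matching /G+A+T+/. The names find_unpreceded,
CANDIDATE and FORBIDDEN_PREFIX are made up for this illustration, and
this is not how a real regex engine would implement it.

```python
import re

# What we actually want to match.
CANDIDATE = re.compile(r"GA+C")
# What must NOT sit immediately before the match. The \Z anchor forces
# the pattern to end exactly where the sliced prefix ends.
FORBIDDEN_PREFIX = re.compile(r"G+A+T+\Z")


def find_unpreceded(text):
    """Return start offsets of GA+C matches not directly preceded by G+A+T+."""
    hits = []
    for m in CANDIDATE.finditer(text):
        # The lookbehind check is postponed until a candidate exists.
        prefix = text[:m.start()]
        if FORBIDDEN_PREFIX.search(prefix) is None:
            hits.append(m.start())
    return hits
```

With this sketch, "GATGAC" yields no hits because "GAC" directly follows
"GAT", while "GATCGAAC" yields one hit at offset 4.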

I feel that your originally proposed syntax is weird. Look at this
variation:

        /GA+(?:C(?!(?r)T+A+G+)|T(?!(?r)G+A+C+))/

which says: /GA+C/ but not preceded by /T+A+G+/, or /GA+T/ but not
preceded by /G+A+C+/.

The discrepancy between *where* this is specified, and where it should
match, really bugs me.

Here's my version:

        /(?`!T+A+G+)GA+C|(?`!G+A+C+)GA+T/

You gain nothing, you lose clarity.

p.s. /(?`!T+A+G+)/ does not mean: must match something that doesn't
match /T+A+G+/. Instead, if something, *anything*, can match it, the
whole match fails.
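
The zero-width nature of the assertion can be shown with an existing
engine. The sketch below uses Python's re module, whose lookbehind must
be fixed-width, so it uses the simplified stand-in /(?<!TAG)GAC/ instead
of the variable-length pattern discussed here; that substitution, and
the helper name match_span, are assumptions made for the example.

```python
import re

# Fixed-width stand-in for the variable-length pattern: Python's re
# module rejects quantifiers like + inside a lookbehind.
PATTERN = re.compile(r"(?<!TAG)GAC")


def match_span(text):
    """Return (start, end) of the first match, or None if there is none."""
    m = PATTERN.search(text)
    return None if m is None else (m.start(), m.end())
```

Here match_span("GAC") gives (0, 3): nothing at all precedes the match,
and the assertion still succeeds. match_span("XXXGAC") gives (3, 6), so
the "XXX" is not consumed as part of the match. match_span("TAGGAC")
gives None, because "TAG" can match right before "GAC".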

-- 
        Bart.
