I also appreciated this solution but for a different reason...it made me look 
up the @- and @+ arrays (where $-[0] and $+[0] come from).  Very cool.  I 
think it would be well worth the time for me (and any beginner) to give 
perldoc perlvar a thorough read...

Cheers everyone,

Nathanael

>===== Original Message From "Kipp, James" <[EMAIL PROTECTED]> =====
>Thanks. I took a look at your site and book and found the chapter on
>look-ahead. I realized how much I was underutilizing them, and they could
>have saved me a lot of headaches!
>
>> -----Original Message-----
>> From: Jeff 'japhy' Pinyan [mailto:[EMAIL PROTECTED]]
>> Sent: Friday, October 04, 2002 11:20 AM
>> To: Kipp, James
>> Cc: [EMAIL PROTECTED]
>> Subject: RE: Reg Exp
>>
>>
>> >>   $dna =~ m{
>> >>     (?=
>> >>       tag
>> >>       (?:
>> >>         .*? tag
>> >>         # the substr(...) is there to avoid using $&
>> >>         (?{ push @matches, substr($dna, $-[0], $+[0] - $-[0]) })
>> >>       )+
>> >>     )
>> >>     (?!)
>> >>   }x;
>>
>> First of all, I haven't benchmarked, and I had thought of doing the
>> index() and substr() approach that J. Krahn demonstrated.
>>
>> The regex uses (?= ... ) to look ahead, so it can match stuff without
>> consuming it.  Here's an example of what I mean:  if I have a string
>> "ABCADEFA", and I want all chunks of "A...A", if the regex actually
>> CONSUMES the "ABCADEFA", then it will have to start after the last A,
>> meaning I've missed the embedded "ADEFA" chunk.  By using a look-ahead,
>> I can match text while staying where I am in the string.  Compare:
>>
>>   print "japhy" =~ /(..)/g;
>>
>> with
>>
>>   print "japhy" =~ /(?=(..))/g;
>>
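
A small sketch of the comparison above, for anyone who wants to run it.  It
uses the two "japhy" one-liners from the message plus the "ABCADEFA" string;
the A.*?A pattern and the variable names are my own choices for illustration,
not something from the original post.

```perl
use strict;
use warnings;

# Consuming match: each match starts where the previous one ended.
my @plain = "japhy" =~ /(..)/g;          # ja, ph

# Look-ahead: the match itself is zero-length, so the next attempt
# starts just one character further along.
my @ahead = "japhy" =~ /(?=(..))/g;      # ja, ap, ph, hy

my $s = 'ABCADEFA';
my @consumed  = $s =~ /(A.*?A)/g;        # ABCA
my @lookahead = $s =~ /(?=(A.*?A))/g;    # ABCA, ADEFA

print "@plain\n@ahead\n@consumed\n@lookahead\n";
```

Note that even the look-ahead version does not report "ABCADEFA" itself,
since each starting position yields only one match.  That gap is what the
code block and (?!) in the full pattern are there to fill.
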
>> Next, to get all the "tag...tag" chunks of varying lengths, I use
>>
>>   /tag(?:.*?tag)+/
>>
>> which matches "tagAtag", "tagAtagBtag", "tagAtagBtagCtag", and so on.
>>
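
To see what that sub-pattern does on its own, here is a minimal sketch.  The
test strings come from the message; the \A and \z anchors and the variable
names are assumptions added for the demonstration.

```perl
use strict;
use warnings;

# Each of the chunks named above is a complete match of the pattern;
# a lone "tag" is not, because the group must match at least once.
my @ok = grep { /\Atag(?:.*?tag)+\z/ }
         qw(tag tagAtag tagAtagBtag tagAtagBtagCtag);

# An ordinary match, however, reports only one chunk per starting
# point: the greedy + keeps adding "...tag" pieces while it can.
my $str   = 'tagAtagBtagCtag';
my @found = $str =~ /(tag(?:.*?tag)+)/g;

print scalar(@ok), " of 4 strings matched\n";
print "found: @found\n";
```

So the pattern can match every chunk, but a plain match only ever hands back
the longest one from the first "tag".
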
>> The real magic is the code block (?{ ... }) that does the dirty work.
>> First of all, substr($dna, $-[0], $+[0] - $-[0]) is just a way of
>> accessing $& without incurring the penalties associated with it.  So
>> let's just use $& for now.  The code (push @matches, $&) is executed
>> at every point where the regex has matched up to an occurrence of
>> "tag", so in
>>
>>   tagTHIStagTHATtagTHOSEtag
>>
>> it'll happen at:
>>
>>   tagTHIStag X
>>   tagTHIStagTHATtag X
>>   tagTHIStagTHATtagTHOSEtag X
>>          tagTHATtag X
>>          tagTHATtagTHOSEtag X
>>                 tagTHOSEtag X
>>
>> those six locations.  The last thing in the regex is the (?!), which is
>> a negative look-ahead for nothing, which ALWAYS fails.  This forces the
>> regex to backtrack, so I get all the matches.
>>
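
Putting the pieces together, here is a runnable sketch of the quoted pattern
applied to the example string from the message.  It assumes, as the pattern
itself does, that $-[0] and $+[0] reflect the current attempt inside the
(?{ ... }) block.  Declaring @matches with "our" is my own choice, to keep
the code block's view of the array simple.

```perl
use strict;
use warnings;

our @matches;    # package variable, filled in from inside the regex
my $dna = 'tagTHIStagTHATtagTHOSEtag';

$dna =~ m{
  (?=
    tag
    (?:
      .*? tag
      (?{ push @matches, substr($dna, $-[0], $+[0] - $-[0]) })
    )+
  )
  (?!)
}x;

# The match as a whole always fails; the side effect is what we want.
print "$_\n" for @matches;
```

If it behaves as described, this should print the six chunks from the list
above, grouped by the "tag" they start from.
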
>> --
>> Jeff "japhy" Pinyan      [EMAIL PROTECTED]
>> http://www.pobox.com/~japhy/
>> RPI Acacia brother #734   http://www.perlmonks.org/   http://www.cpan.org/
>** Look for "Regular Expressions in Perl" published by Manning, in 2002 **
><stu> what does y/// stand for?  <tenderpuss> why, yansliterate of course.
>[  I'm looking for programming work.  If you like my work, let me know.  ]
>
>
>
>--
>To unsubscribe, e-mail: [EMAIL PROTECTED]
>For additional commands, e-mail: [EMAIL PROTECTED]

"Ain't no blood in my body, it's liquid soul in my veins"
~Roots Manuva

