On Dec 14, Deven T. Corzine said:
>The crux of the problem is that non-greedy qualifiers don't affect the
>"earliest match" behavior, which makes the matches more greedy than they
>really ought to be.
That's because "greediness" is just a measure of crawl vs. backtrack. The
regex /a.*b/ will match an 'a', then as many non-\n characters as possible,
and then look for a 'b'. Upon failing, it backs up one character and tries
again. On the other hand, /a.*?b/ matches an 'a', then 0 characters, and
then tries to match a 'b'; upon failing, it matches one more character and
tries again, and so on.
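Here is a minimal sketch of that difference. The sample string "xaybzb" is made up for illustration and is not from the original thread:

```perl
use strict;
use warnings;

# Hypothetical sample string: one 'a' followed by two candidate 'b's.
my $str = "xaybzb";

# Greedy: .* grabs everything to the end of the string, then backs up
# one character at a time until a 'b' can match (the last 'b').
my ($greedy) = $str =~ /(a.*b)/;

# Non-greedy: .*? starts with zero characters and crawls forward one
# character at a time until a 'b' can match (the first 'b').
my ($lazy) = $str =~ /(a.*?b)/;

print "greedy: $greedy\n";   # aybzb
print "lazy:   $lazy\n";     # ayb
```

Both matches begin at the same 'a'; only the end of the match moves.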
> $_ = "aaaabbbbccccddddeeee";
> ($greedy) = /(b.*d)/; # "bbbbccccdddd" (correct)
> ($non_greedy) = /(b.*?d)/; # "bbbbccccd" (should be "bccccd"!)
>
>Does anyone disagree with the premise, and believe that "bbbbccccd" is the
>CORRECT match for the non-greedy regexp above?
> match as many times as possible (given a particular starting
> location) while still allowing the rest of the pattern to match.
The starting location is the first 'b' it matches. Greediness has nothing
to do with the 'b' in your regex -- it has to do with the '.'. The engine
matches a 'b', and then starts working on 0 or more of anything.
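To see that the starting location is the same in both cases, here is a small sketch using Perl's @- and @+ arrays (match start and end offsets) on the string from the quoted example. The variable names are only illustrative:

```perl
use strict;
use warnings;

my $str = "aaaabbbbccccddddeeee";

# $-[0] is the offset where the whole match starts; $+[0] is the offset
# just past where it ends.
$str =~ /b.*d/ or die "no match";
my ($g_start, $g_end) = ($-[0], $+[0]);

$str =~ /b.*?d/ or die "no match";
my ($l_start, $l_end) = ($-[0], $+[0]);

print "greedy:     starts at $g_start, ends at $g_end\n";  # 4, 16
print "non-greedy: starts at $l_start, ends at $l_end\n";  # 4, 13
```

Both patterns start at offset 4, the first 'b'. The qualifier only changes where the match ends.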
You're asking for something like
/(b(?!b).*?d)/
which is an "optimization" you'll have to incorporate on your own.
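Here is a minimal sketch of one pattern that yields "bccccd" on the sample string. It assumes a negative lookahead, which is just one possible choice: a 'b' that is followed by another 'b' is rejected as a starting point.

```perl
use strict;
use warnings;

my $str = "aaaabbbbccccddddeeee";

# Plain non-greedy: the match still starts at the first 'b'.
my ($plain) = $str =~ /(b.*?d)/;

# With a lookahead: a 'b' followed by another 'b' cannot start the
# match, so the engine moves on to the last 'b' of the run.
my ($tight) = $str =~ /(b(?!b).*?d)/;

print "plain: $plain\n";  # bbbbccccd
print "tight: $tight\n";  # bccccd
```

This only skips over a run of consecutive 'b's. It is not a general "shortest match": on a string like "bcbd" it still returns "bcbd", not "bd".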
--
Jeff "japhy" Pinyan http://www.pobox.com/~japhy/
CPAN - #1 Perl Resource (my id: PINYAN) http://search.cpan.org/
PerlMonks - An Online Perl Community http://www.perlmonks.com/
The Perl Archive - Articles, Forums, etc. http://www.perlarchive.com/