I wrote:
>More generally, it seems to me that you're hung up on the
>description of "*?" as "shortest possible match". That's an
>ambiguous simplification of what "*?" means. It might better be
>described as "match until you find a match for the rest of the
>regex" ('d' in your example). If oversimplifications in the
>documentation led you to believe that "*?" meant something it was
>never intended to mean, then perhaps the documentation should be
>clarified.
I should have added that when I first came across non-greedy regexes,
I made exactly the same erroneous assumption (that assumption being,
"*?" finds the shortest possible match, or at least the shortest
possible local match). Once I learned the actual meaning, I realized
that it was more sensible than my initial naive interpretation.
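
To make the difference concrete, here is a minimal sketch in Python, whose re module is a backtracking engine that treats "*?" the same way for this case. The input string and pattern are made up for illustration; they are not the example from the earlier message.

```python
import re

# Hypothetical input and pattern, chosen for illustration only.
text = "abcabd"
m = re.search(r"a.*?d", text)

# The engine commits to the leftmost 'a' and then extends ".*?" one
# character at a time until the rest of the regex ('d') matches.
leftmost = m.group(0)

# The shortest substring that fits the pattern is found here by brute
# force over all substrings, purely for comparison.
candidates = [text[i:j]
              for i in range(len(text))
              for j in range(i + 1, len(text) + 1)
              if re.fullmatch(r"a.*?d", text[i:j])]
shortest = min(candidates, key=len)

print(leftmost)   # abcabd
print(shortest)   # abd
```

The non-greedy match starts at the first 'a' and stops at the first 'd' it can reach from there, so it returns "abcabd" even though the shorter "abd" exists later in the string.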
Deven seems to be advocating thinking about regular expressions
without worrying too much about the implementation, even at a fairly
abstract level. (By abstract level, I mean something like "keep
matching non-newlines, until you come to a 'd'".) I think this is a
serious mistake. In my younger, less experienced days, I showed
great talent for writing regexes which, if not
heat-death-of-the-universe-slow, were at least inefficient enough to
exhaust Perl's available memory. This happened when attempting to do
fairly innocuous things, like extracting the headers of an email message.
Moral: Ignore the underlying implementation at your peril.
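
As a rough illustration of that moral, the sketch below (in Python, which also uses a backtracking matcher) compares two hypothetical patterns that accept exactly the same strings. These are not the regexes I actually wrote back then, and the names NESTED, FLAT, time_match and compare are invented for this example. Timings will vary by machine and engine version, so none are quoted here.

```python
import re
import time

# Hypothetical patterns: both accept a run of 'a' characters up to the
# end of the string.
NESTED = re.compile(r"(a+)+$")  # nested quantifiers: many ways to split the run
FLAT = re.compile(r"a+$")       # same strings accepted, only one way to match


def time_match(pattern, text):
    """Return (matched, elapsed_seconds) for pattern.match(text)."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


def compare(n):
    """Run both patterns against n 'a's followed by a 'b'.

    The trailing 'b' forces the match to fail, so a backtracking
    engine has to try every alternative before giving up.
    """
    text = "a" * n + "b"
    return time_match(NESTED, text), time_match(FLAT, text)


if __name__ == "__main__":
    # Keep n small: the work done for NESTED can grow very quickly with n.
    for n in (10, 14, 18):
        (_, t_nested), (_, t_flat) = compare(n)
        print(n, f"nested={t_nested:.6f}s", f"flat={t_flat:.6f}s")
```

Both patterns give the same answer on every input; the difference is only in how much work the engine may do to reach it, which is exactly the part you cannot see if you think about the regex purely as a description of what to match.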