I'm trying to understand how multiple quantifiers in a
regular expression get massaged to fit the data.
In other words:
 
my $string = "was not was not was not the time was not was not was not was not ";
 
if($string =~ m{(.*)was(.*)the time}) {
     print "matched\n";
     print "1 is '$1'\n"; 
     print "2 is '$2'\n";
}
 
The first (.*) swallows the whole string, and the
match then fails at "was" in the regular expression
because nothing is left to match it. So the first (.*)
backs up until a "was" follows it in the string. The
second (.*) then swallows the rest, and the match fails
when it looks for "the time", so it backs up again
until it finds a "was" followed by "the time".
 
I know it "just works", but I'm trying to figure
out exactly how it behaves under the hood.
I think if I were to write it in pseudocode,
it would look something like this:
 
foreach my $quantifier ( @list_of_quantifiers ) { # ($1 .. $n)
     while(nomatch) {
          $quantifier->reducecaptureby1andtryagain;
     }
}
 
But I think there might be other ways of
implementing it that would be subtly different.
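
For comparison, here is one such variant as a rough, runnable sketch. It is
not how Perl's engine is actually written (the real engine compiles the
pattern and applies optimizations); it only mimics the observable
backtracking order for this one pattern. The name match_by_hand is made up
for illustration, and the sketch assumes the match starts at offset 0. The
difference from the pseudocode above is that the loops are nested: every time
the first (.*) gives back a character, the second (.*) starts over at its
greediest.

```perl
use strict;
use warnings;

# Hand-rolled stand-in for m{(.*)was(.*)the time}, tried only at offset 0.
# (A real engine would also retry from later start offsets; omitted here.)
sub match_by_hand {
    my ($string) = @_;

    # '.' does not match "\n" by default, so '.*' can reach at most
    # the first newline (or the end of the string if there is none).
    my $nl  = index($string, "\n");
    my $max = $nl < 0 ? length($string) : $nl;

    # Outer loop: the first (.*) starts as long as possible and gives
    # back one character per iteration.
    for (my $len1 = $max; $len1 >= 0; $len1--) {
        # The literal "was" must follow the first capture immediately.
        next unless substr($string, $len1, 3) eq 'was';
        my $start2 = $len1 + 3;

        # Inner loop: the second (.*) restarts at its longest for every
        # candidate length of the first capture.
        for (my $len2 = $max - $start2; $len2 >= 0; $len2--) {
            if (substr($string, $start2 + $len2, 8) eq 'the time') {
                return (substr($string, 0, $len1),
                        substr($string, $start2, $len2));
            }
        }
        # Inner loop exhausted: fall through so the first (.*) shrinks again.
    }
    return;    # no match
}

my ($one, $two) = match_by_hand(
    "was not was not was not the time was not was not was not was not ");
print "1 is '$one'\n";
print "2 is '$two'\n";
```

For the string above this should give the same $1 and $2 as the real regex,
which is an easy way to check the sketch against the engine.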
 
anyone?
 
 
_______________________________________________
Boston-pm mailing list
[email protected]
http://mail.pm.org/mailman/listinfo/boston-pm
