R. Joseph Newton wrote:
> Rob Dixon wrote:
>>
>>     print my $t="href=\"(.*?)\"><IMG.*?border=0>";
>> outputs
>>     href="(.*?)"><IMG.*?border=0>
>
> Hmmmm.  this gets me thinking.  How does that translate into a regex
> when such a variable is passed in?  Would Perl do the escaping and
> feed the regex something like:
>
> /href="\(\.\*\?\)\"><IMG\.\*\?border=0>/
> then?

Hi Joseph. No, quite the contrary in fact: Perl adds no escaping for
you, and you have to beware of two stages of interpretation, which
can get very confusing. Remember that there are two sorts of escaping
required in a regex string. There's the one that's necessary to
disable the initial interpolation in double-quote context, such as
using

    \$variable  \\n

And there's the other one that protects regex metacharacters from
being interpreted as anything other than ordinary characters for the
purposes of the search, like

    \^  \.  \|  \$  \+  \*
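
To make that concrete, here is a small sketch using the string from the
top of the thread. The $html sample is made up for illustration, and
quotemeta is my suggestion for when you do want everything escaped; it
is not something Perl applies on its own.

```perl
use strict;
use warnings;

# The string from the start of the thread; after double-quote processing
# it holds:  href="(.*?)"><IMG.*?border=0>
my $t = "href=\"(.*?)\"><IMG.*?border=0>";

# Made-up sample input, purely for illustration.
my $html = '<A href="page.html"><IMG src="x.gif" border=0></A>';

# Interpolated as-is: ( ) . * ? keep their regex meaning,
# so the parentheses capture the URL.
my ($url) = $html =~ /$t/;
print "captured: $url\n";

# Escaped on request: quotemeta (or \Q...\E inside the pattern)
# backslashes the non-word characters, so the pattern now matches
# only the literal text of $t.
my $literal = quotemeta $t;
print "literal pattern: $literal\n";
print "matches \$html: ", ($html =~ /$literal/ ? "yes" : "no"), "\n";
print "matches \$t:    ", ($t    =~ /$literal/ ? "yes" : "no"), "\n";
```

The unescaped pattern captures page.html from the sample, while the
quotemeta'd version matches only a string that literally contains the
text of $t.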

Consider the situation where $s holds the two-character string '\n'.
If we define a pattern to match this as:

    my $pat = "\n";
    print "MATCH\n" if ($s =~ /$pat/);

then clearly $pat contains just the single newline character "\x0A"
which will not match either of the characters in $s and so will fail.

If we set $pat to "\\n", then the resulting string is two characters
long, and the test ($s eq $pat) would succeed. However, if we try

    print "MATCH\n" if ($s =~ /$pat/);

then it will fail, because the test is the same as $s =~ /\n/, which
is expanded a second time to be the same as $s =~ /\x0A/.

With three backslashes, $pat = "\\\n", we get a two-character
string containing a backslash and a newline, which again cannot
match our $s.

Finally, with four backslashes, $pat = "\\\\n", the result is that
$pat is three characters long: two backslashes and an 'n'. The test
now becomes $s =~ /\\n/ which, after the regex engine's own
interpretation of the escape, will correctly match the two-character
string '\n'.
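
The four cases can be run side by side with a short script. This is
only a sketch: the labels and the %matched hash are names I made up to
keep the output readable.

```perl
use strict;
use warnings;

# $s is the two-character string: a backslash followed by 'n'.
my $s = '\n';

# Each candidate is written as in the text above; the comment gives the
# characters the string actually holds after double-quote processing.
my @candidates = (
    [ 'one backslash',     "\n"     ],   # newline
    [ 'two backslashes',   "\\n"    ],   # backslash, n
    [ 'three backslashes', "\\\n"   ],   # backslash, newline
    [ 'four backslashes',  "\\\\n"  ],   # backslash, backslash, n
);

my %matched;
for my $c (@candidates) {
    my ($label, $pat) = @$c;
    # The regex engine interprets $pat a second time here.
    $matched{$label} = ($s =~ /$pat/) ? 1 : 0;
    printf "%-18s length=%d  %s\n", $label, length $pat,
        $matched{$label} ? 'MATCH' : 'no match';
}
```

Only the four-backslash line should report MATCH, and the lengths come
out as 1, 2, 2 and 3.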

In all, it's best to use qr// to define standalone regular
expressions, with single quotes as the delimiter to avoid the
initial interpolation:

    my $pat = qr'\\n';

so you can then use $s =~ $pat with the exact same
effect as $s =~ /\\n/.
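
As a quick check, here is a minimal sketch of the qr form in use. The
second test, which embeds the compiled pattern in a larger regex, is an
extra I added for illustration.

```perl
use strict;
use warnings;

my $s = '\n';            # backslash followed by 'n'

# Single-quote delimiters: no variable interpolation, and the
# backslashes go to the regex engine untouched, so the pattern is \\n
# (a literal backslash, then 'n').
my $pat = qr'\\n';

print "MATCH\n" if $s =~ $pat;

# A compiled pattern can also be embedded in a larger regex.
my $longer = "$s and more";
print "MATCH at start\n" if $longer =~ /^$pat/;
```

Both tests should print their message, since $s and $longer each
contain a backslash followed by 'n'.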

Hope that's clear. Cheers,

Rob



