Alexei A. Frounze wrote:
Here's a sample program to show the problems I'm having with regexps:
my $s = "a\tb";
$s contains a three character string, the 'a' character, the TAB
character and the 'b' character.
my $m1 = "\t";
$m1 contains a single TAB character.
my $m2 = "\\t"; # both "\t" and "\\t" are treated specially
$m2 contains two characters, the '\' character and the 't' character.
my $r1 = "\t";
Same as $m1.
my $r2 = "\\t";
Same as $m2.
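One way to check what each of these strings really holds is to print its length and character codes. This is only a sketch, and the helper name describe is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): list the character
# codes of a string as hex values.
sub describe {
    my ($str) = @_;
    return join ' ', map { sprintf '0x%02X', ord($_) } split //, $str;
}

my $m1 = "\t";     # escape is processed: one TAB character
my $m2 = "\\t";    # escaped backslash: '\' followed by 't'

printf "m1: length %d, chars %s\n", length($m1), describe($m1);
printf "m2: length %d, chars %s\n", length($m2), describe($m2);
```

The first line should report one character (0x09) and the second two characters (0x5C 0x74).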
my $s1 = $s;
my $s2 = $s;
$s1 =~ s/$m1/$r1/;
The same as: $s1 =~ s/\t/\t/; Replace a TAB character with a TAB character.
$s2 =~ s/$m1/$r2/;
The same as: $s2 =~ s/\t/\\t/; Replace a TAB character with the two
character string '\t'.
print "\"$s\" =~ s/$m1/$r1/ -> \"$s1\"\n";
print "\"$s\" =~ s/$m1/$r2/ -> \"$s2\"\n";
$s1 = $s;
$s2 = $s;
$s1 =~ s/$m2/$r1/;
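The same as: $s1 =~ s/\t/\t/; The pattern is compiled as a regex after
interpolation, so the two characters '\t' in $m2 act as the regex escape
for a TAB. This replaces a TAB character with a TAB character.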
$s2 =~ s/$m2/$r2/;
The same as: $s2 =~ s/\t/\\t/; The pattern in $m2 is compiled as a regex,
where '\t' matches a TAB, so this replaces a TAB character with the two
character string '\t'.
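The difference between the two sides of s/// can be seen in a small test. The sketch below assumes the same strings as the sample program. It shows that an interpolated pattern is compiled as a regex, that \Q...\E turns it into a literal match, and that an interpolated replacement is inserted as-is:

```perl
use strict;
use warnings;

my $s  = "a\tb";
my $m2 = "\\t";    # two characters: backslash and 't'
my $r2 = "\\t";    # the same two characters

# Pattern side: the interpolated text is compiled as a regex, so \t means TAB.
my $matches_tab = ($s =~ /$m2/) ? 1 : 0;

# With \Q...\E the metacharacters are escaped, so the pattern looks for a
# literal backslash followed by 't', which $s does not contain.
my $matches_literal = ($s =~ /\Q$m2\E/) ? 1 : 0;

# Replacement side: the interpolated text is inserted as-is, not re-parsed.
(my $s2 = $s) =~ s/$m2/$r2/;

print "matches TAB:     $matches_tab\n";
print "matches literal: $matches_literal\n";
print "result length:   ", length($s2), "\n";
```

If the search string comes from data and should always be matched literally, \Q$m2\E (or quotemeta) is the usual tool.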
print "\"$s\" =~ s/$m2/$r1/ -> \"$s1\"\n";
print "\"$s\" =~ s/$m2/$r2/ -> \"$s2\"\n";
$s = "aabcc";
$s1 = $s;
$s2 = $s;
my $m = "a(b)c"; # "(" is treated specially, but "\\(" isn't
my $r = "$1";
perldoc -q quoting
Because no successful match with capturing parentheses has run yet, $1 is
undefined and interpolates as an empty string, so that is the same as:
my $r = "";
If you had warnings enabled then you would have been warned "Use of
uninitialized value in string" at this point.
You probably want either:
my $r = "\$1";
Or:
my $r = '$1';
to prevent interpolation.
If you had used a regular expression with capturing parentheses before
this point then $r would contain whatever was captured by them.
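To see the three forms side by side, here is a small sketch. The string "xyz" and the variable names are made up for illustration; the earlier match is there only so that $1 has a value when the double-quoted string is built:

```perl
use strict;
use warnings;

my $text = "xyz";
$text =~ /(y)/ or die "no match";   # a successful match sets $1 to "y"

my $interpolated = "$1";    # interpolated right now: holds "y"
my $escaped      = "\$1";   # holds the two characters '$' and '1'
my $single       = '$1';    # same two characters, no interpolation

print "interpolated: $interpolated\n";
print "escaped:      $escaped\n";
print "single:       $single\n";
```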
$s1 =~ s/$m/$r/; # $r (= "$1") is substituted with an empty string
The FAQ:
perldoc -q "How can I expand variables in text strings"
describes what you are trying to do here.
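If you would rather not evaluate data as code, one alternative is to expand the $N placeholders yourself. The following is a minimal sketch, not the FAQ's method. The function name subst_with_template is made up, and it only understands plain $1, $2, ... in the template:

```perl
use strict;
use warnings;

# Hypothetical helper: replace the first match of $pattern in $text with
# $template, where $1, $2, ... in the template stand for capture groups.
sub subst_with_template {
    my ($text, $pattern, $template) = @_;

    if ($text =~ /$pattern/) {
        # Remember where the match was and what each group captured.
        my ($start, $end) = ($-[0], $+[0]);
        my @groups = map {
            defined $-[$_] ? substr($text, $-[$_], $+[$_] - $-[$_]) : ''
        } 0 .. $#+;

        # Expand $N placeholders; unknown group numbers become ''.
        (my $replacement = $template) =~
            s{\$(\d+)}{ $1 <= $#groups ? $groups[$1] : '' }ge;

        # Splice the expanded replacement over the matched span.
        substr($text, $start, $end - $start, $replacement);
    }
    return $text;
}

print subst_with_template("aabcc", "a(b)c", '$1'), "\n";
```

With the values from the sample program this should give "abc", the same as writing s/$m/$1/ directly.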
$s2 =~ s/$m/$1/; # $1 is substituted with "b"
print "\"$s\" =~ s/$m/\$r/ -> \"$s1\", where \$r=\"\$1\"\n";
print "\"$s\" =~ s/$m/\$1/ -> \"$s2\"\n";
It prints:
"a b" =~ s/ / / -> "a b"
"a b" =~ s/ /\t/ -> "a\tb"
"a b" =~ s/\t/ / -> "a b"
"a b" =~ s/\t/\t/ -> "a\tb"
"aabcc" =~ s/a(b)c/$r/ -> "ac", where $r="$1"
"aabcc" =~ s/a(b)c/$1/ -> "abc"
The first problem (as you've probably already realized from the first
4 lines of the output) is that in the search expression both the tab
character "\t" and the backslash+t sequence "\\t" are treated the
same, as the tab character. Why is that?
The second problem is that in the replace expression the same
backslash+t sequence "\\t" isn't treated the same way as in the search
expression. Why this inconsistency? Can I somehow force s/// to
treat "\\t" in replace as "\t"?
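One possible way to get that effect is to convert the escapes in the replacement string yourself before using it. This is only a sketch: the helper name unescape and the set of escapes it recognizes are assumptions for illustration, not something built into s///:

```perl
use strict;
use warnings;

# Escapes this sketch chooses to recognize; extend as needed.
my %ESCAPE = (t => "\t", n => "\n", r => "\r", '\\' => "\\");

# Hypothetical helper: turn backslash sequences in data into real characters.
sub unescape {
    my ($str) = @_;
    $str =~ s/\\([tnr\\])/$ESCAPE{$1}/g;
    return $str;
}

my $r2          = "\\t";            # backslash + 't', e.g. read from a file
my $replacement = unescape($r2);    # now a single TAB character

my $s1 = "a b";
$s1 =~ s/ /$replacement/;           # the space becomes a real TAB

print "length after substitution: ", length($s1), "\n";
```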
The third problem should be apparent from the last 2 lines of the
output. I want the $1 match (and possibly a few more) to come from
elsewhere (string data) and not be predetermined by the code. But it
doesn't seem to work this way. Is it possible to fix this?
You may want to read the sections "Quote and Quote-like Operators",
"Regexp Quote-Like Operators" and "Gory details of parsing quoted
constructs" in the perlop document:
perldoc perlop
John
--
Those people who think they know everything are a great
annoyance to those of us who do. -- Isaac Asimov