Alexei A. Frounze wrote:
Here's a sample program to show the problems I'm having with regexps:
my $s = "a\tb";
$s contains a three character string, the 'a' character, the TAB character and the 'b' character.
my $m1 = "\t";
$m1 contains a single TAB character.
my $m2 = "\\t"; # both "\t" and "\\t" are treated specially
$m2 contains two characters, the '\' character and the 't' character.
my $r1 = "\t";
Same as $m1.
my $r2 = "\\t";
Same as $m2.
my $s1 = $s;
my $s2 = $s;
$s1 =~ s/$m1/$r1/;
The same as: $s1 =~ s/\t/\t/; Replace a TAB character with a TAB character.
$s2 =~ s/$m1/$r2/;
The same as: $s2 =~ s/\t/\\t/; Replace a TAB character with the two character string '\t'.
print "\"$s\" =~ s/$m1/$r1/ -> \"$s1\"\n";
print "\"$s\" =~ s/$m1/$r2/ -> \"$s2\"\n";
my $s1 = $s;
my $s2 = $s;
$s1 =~ s/$m2/$r1/;
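$m2 is interpolated into the pattern first, and the regular expression engine then reads the two characters '\t' as the escape for a TAB character, so this is the same as: $s1 =~ s/\t/\t/; Replace a TAB character with a TAB character.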
$s2 =~ s/$m2/$r2/;
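The same as: $s2 =~ s/\t/\\t/; The pattern matches a TAB character, as any interpolated '\t' does, but the replacement text is not parsed again after interpolation, so the TAB character is replaced with the two character string '\t'.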
print "\"$s\" =~ s/$m2/$r1/ -> \"$s1\"\n";
print "\"$s\" =~ s/$m2/$r2/ -> \"$s2\"\n";
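The difference between the two sides of s/// can be seen in isolation. What follows is only a minimal sketch: the variable names $s1, $s2, $s3 and the replacement text "X" are made up for illustration. It shows that an interpolated pattern is compiled as a regular expression, that \Q...\E (quotemeta) turns the interpolated text into a literal match, and that an interpolated replacement is used as-is.

```perl
use strict;
use warnings;

my $s  = "a\tb";    # 'a', TAB, 'b'
my $m2 = "\\t";     # two characters: backslash and 't'
my $r2 = "\\t";     # the same two characters

# Pattern side: the interpolated text is compiled as a regex, so the
# two characters \t are read as the escape for a TAB character.
(my $s1 = $s) =~ s/$m2/X/;        # the TAB matches

# \Q...\E quotes the interpolated text, so the pattern now looks for
# a literal backslash followed by 't', which $s does not contain.
(my $s2 = $s) =~ s/\Q$m2\E/X/;    # no match, string unchanged

# Replacement side: the interpolated text is inserted unchanged.
(my $s3 = $s) =~ s/\t/$r2/;       # TAB becomes backslash + 't'

printf "%-12s length %d\n", 'plain:',   length $s1;
printf "%-12s length %d\n", 'quoted:',  length $s2;
printf "%-12s length %d\n", 'replaced:', length $s3;
```

Printing the lengths rather than the strings avoids the TAB being invisible in the output.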
$s = "aabcc";
$s1 = $s;
$s2 = $s;
my $m = "a(b)c"; # "(" is treated specially, but "\\(" isn't
my $r = "$1";
perldoc -q quoting

Because capturing parentheses have not been used yet, $1 will contain nothing, so that is the same as:
my $r = "";
If you had warnings enabled then you would have been warned "Use of uninitialized value in string" at this point.
You probably want either:
my $r = "\$1";
Or:
my $r = '$1';
to prevent interpolation. If you had used a regular expression with capturing parentheses before this point then $r would contain whatever was captured by them.
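A short sketch of the three spellings side by side, assuming it runs before any successful match so that $1 is still undefined. The variable names and the warning handler are only there for illustration; the handler collects the warning so that it can be printed along with the results.

```perl
use strict;
use warnings;

# Collect warnings instead of letting them go straight to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $interpolated = "$1";     # $1 is undefined here, so this is ""
my $escaped      = "\$1";    # two characters: '$' and '1'
my $single       = '$1';     # the same two characters

printf "interpolated: '%s' (length %d)\n", $interpolated, length $interpolated;
printf "escaped:      '%s' (length %d)\n", $escaped,      length $escaped;
printf "single:       '%s' (length %d)\n", $single,       length $single;
print  "warning:      $_" for @warnings;
```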
$s1 =~ s/$m/$r/; # $r (= "$1") is substituted with an empty string
The FAQ: perldoc -q "How can I expand variables in text strings" describes what you are trying to do here.
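One way to do that without evaluating the data as Perl code is to run the match yourself and fill in the placeholders by hand. The following is only a sketch under some assumptions: expand_template and substitute_with_template are hypothetical helper names, not anything from the FAQ, only the first match is replaced, and only placeholders of the form $1, $2, ... are recognised. perlop also documents the /ee modifier, which evaluates the replacement as Perl code; that is shorter but should not be used on data you do not trust.

```perl
use strict;
use warnings;

# Hypothetical helper: replace $1, $2, ... in a template with the
# corresponding captures; groups that did not take part become "".
sub expand_template {
    my ($template, @captures) = @_;
    $template =~ s{\$([1-9]\d*)}{ $captures[$1 - 1] // '' }ge;
    return $template;
}

# Hypothetical helper: like s/$pattern/$template/ for the first match,
# but with the template's $N placeholders filled in at run time.
# Limitation: a pattern without capturing parentheses returns (1) in
# list context, so such patterns need separate handling.
sub substitute_with_template {
    my ($string, $pattern, $template) = @_;
    if (my @captures = $string =~ /$pattern/) {
        # Save the match offsets before another regex overwrites them.
        my ($start, $end) = ($-[0], $+[0]);
        substr($string, $start, $end - $start)
            = expand_template($template, @captures);
    }
    return $string;
}

print substitute_with_template("aabcc", "a(b)c", '$1'),   "\n";
print substitute_with_template("aabcc", "a(b)c", '[$1]'), "\n";
```

The defined-or operator // needs Perl 5.10 or later.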
$s2 =~ s/$m/$1/; # $1 is substituted with "b"
print "\"$s\" =~ s/$m/\$r/ -> \"$s1\", where \$r=\"\$1\"\n";
print "\"$s\" =~ s/$m/\$1/ -> \"$s2\"\n";

It prints:
"a b" =~ s/ / / -> "a b"
"a b" =~ s/ /\t/ -> "a\tb"
"a b" =~ s/\t/ / -> "a b"
"a b" =~ s/\t/\t/ -> "a\tb"
"aabcc" =~ s/a(b)c/$r/ -> "ac", where $r="$1"
"aabcc" =~ s/a(b)c/$1/ -> "abc"

The first problem (as you've probably already realized from the first 4 lines of the output) is that in the search expression both the tab character "\t" and the backslash+t sequence "\\t" are treated the same, as the tab character. Why is that?

The second problem is that in the replace expression the same backslash+t sequence "\\t" isn't treated the same way as in the search expression. Why this inconsistency? Can I somehow force s/// to treat "\\t" in replace as "\t"?

The third problem should be apparent from the last 2 lines of the output. I want the $1 match (and possibly a few more) to come from elsewhere (string data) and not be predetermined by the code. But it doesn't seem to work this way. Is it possible to fix this?
You may want to read the sections "Quote and Quote-like Operators", "Regexp Quote-Like Operators" and "Gory details of parsing quoted constructs" in the perlop document:
perldoc perlop

John
-- 
Those people who think they know everything are a great annoyance to those of us who do. -- Isaac Asimov
