Alexei A. Frounze wrote:
Here's a sample program to show the problems I'm having with regexps:
my $s = "a\tb";

$s contains a three character string, the 'a' character, the TAB character and the 'b' character.

my $m1 = "\t";

$m1 contains a single TAB character.

my $m2 = "\\t"; # both "\t" and "\\t" are treated specially

$m2 contains two characters, the '\' character and the 't' character.

my $r1 = "\t";

Same as $m1.

my $r2 = "\\t";

Same as $m2.

my $s1 = $s;
my $s2 = $s;
$s1 =~ s/$m1/$r1/;

The same as: $s1 =~ s/\t/\t/; Replace a TAB character with a TAB character.

$s2 =~ s/$m1/$r2/;

The same as: $s2 =~ s/\t/\\t/; Replace a TAB character with the two-character string '\t'.

print "\"$s\" =~ s/$m1/$r1/ -> \"$s1\"\n";
print "\"$s\" =~ s/$m1/$r2/ -> \"$s2\"\n";

$s1 = $s; # already declared above, so no second "my"
$s2 = $s;
$s1 =~ s/$m2/$r1/;

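The same as: $s1 =~ s/\t/\t/; $m2 is interpolated into the pattern, where the regex engine reads the two characters '\t' as the escape for a TAB, so this replaces a TAB character with a TAB character.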

$s2 =~ s/$m2/$r2/;

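The same as: $s2 =~ s/\t/\\t/; The interpolated pattern '\t' is read by the regex engine as a TAB escape, so this replaces a TAB character with the two-character string '\t'. The replacement part is interpolated once and is not parsed for escapes a second time.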

print "\"$s\" =~ s/$m2/$r1/ -> \"$s1\"\n";
print "\"$s\" =~ s/$m2/$r2/ -> \"$s2\"\n";
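
A minimal sketch of the two behaviours discussed above, in case it helps. It assumes nothing beyond core Perl; the variable names are illustrative and not from the original program. It checks that a backslash+t pair interpolated into the pattern is compiled as the TAB escape, that \Q...\E makes it literal, and that the same pair in the replacement is inserted unchanged.

```perl
use strict;
use warnings;

my $tab_string = "a\tb";    # 'a', TAB, 'b'
my $two_chars  = "\\t";     # backslash followed by 't'

# Interpolated into a pattern, the two characters are compiled as the
# regex escape \t, so they match a real TAB character.
my $as_regex = $tab_string =~ /$two_chars/ ? 1 : 0;

# \Q...\E (quotemeta) escapes the backslash, so only a literal
# backslash followed by 't' can match; a TAB no longer does.
my $as_literal = $tab_string =~ /\Q$two_chars\E/ ? 1 : 0;

# The replacement part is only interpolated, never compiled as a regex,
# so the two characters are inserted unchanged.
(my $replaced = $tab_string) =~ s/\t/$two_chars/;

print "regex match: $as_regex, literal match: $as_literal\n";
print "replaced: $replaced\n";
```

If the search string is meant to be taken literally, wrapping it in \Q...\E is usually the simplest fix.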

$s = "aabcc";
$s1 = $s;
$s2 = $s;
my $m = "a(b)c"; # "(" is treated specially, but "\\(" isn't
my $r = "$1";

perldoc -q quoting

Because no regular expression with capturing parentheses has matched yet, $1 is undefined at this point, so that is effectively the same as:

my $r = "";

If you had warnings enabled then you would have been warned "Use of uninitialized value in string" at this point.

You probably want either:

my $r = "\$1";

Or:

my $r = '$1';

to prevent interpolation.

If you had used a regular expression with capturing parentheses before this point then $r would contain whatever was captured by them.

$s1 =~ s/$m/$r/; # $r (= "$1") is substituted with an empty string

The FAQ:

perldoc -q "How can I expand variables in text strings"

describes what you are trying to do here.
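
As a rough sketch of two ways this could be done (not necessarily the approach the FAQ takes): the first uses the /ee modifier, the second avoids eval by filling in the template by hand. The names $pattern, $template, $with_ee and $without_eval are made up for this example.

```perl
use strict;
use warnings;

my $text     = "aabcc";
my $pattern  = "a(b)c";    # search expression held in string data
my $template = '[$1]';     # single quotes: '$1' stays literal until match time

# Option 1: /ee. The first /e evaluates the expression, giving a string of
# Perl code (the template wrapped in double quotes); the second /e evals
# that string, at which point $1 has been set by the match.
# This runs data as code, so it is only suitable for trusted input.
(my $with_ee = $text) =~ s/$pattern/'"' . $template . '"'/ee;

# Option 2: no eval. Match first, then fill $1, $2, ... in the template
# from the list of captures and splice the result into a copy of the text.
my $without_eval = $text;
if (my @captures = $text =~ /$pattern/) {
    my ($start, $length) = ($-[0], $+[0] - $-[0]);    # span of the whole match
    (my $filled = $template) =~
        s/\$(\d+)/defined $captures[$1 - 1] ? $captures[$1 - 1] : ''/ge;
    substr($without_eval, $start, $length) = $filled;
}

print "$with_ee\n$without_eval\n";
```

Option 2 assumes the pattern contains at least one capturing group and handles only the first match; it is meant to show the idea, not to be a complete solution.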

$s2 =~ s/$m/$1/; # $1 is substituted with "b"
print "\"$s\" =~ s/$m/\$r/ -> \"$s1\", where \$r=\"\$1\"\n";
print "\"$s\" =~ s/$m/\$1/ -> \"$s2\"\n";

It prints:
"a      b" =~ s/        /       / -> "a b"
"a      b" =~ s/        /\t/ -> "a\tb"
"a      b" =~ s/\t/     / -> "a b"
"a      b" =~ s/\t/\t/ -> "a\tb"
"aabcc" =~ s/a(b)c/$r/ -> "ac", where $r="$1"
"aabcc" =~ s/a(b)c/$1/ -> "abc"

The first problem (as you've probably already realized from the first
4 lines of the output) is that in the search expression both the tab
character "\t" and the backslash+t sequence "\\t" are treated the
same, as the tab character. Why is that?

The second problem is that in the replace expression the same
backslash+t sequence "\\t" isn't treated the same way as in the search
expression. Why is there this inconsistency? Can I somehow force s/// to
treat "\\t" in replace as "\t"?

The third problem should be apparent from the last 2 lines of the
output. I want the $1 match (and possibly a few more) to come from
elsewhere (string data) and not be predetermined by the code. But it
doesn't seem to work this way. Is it possible to fix this?

You may want to read the sections "Quote and Quote-like Operators", "Regexp Quote-Like Operators" and "Gory details of parsing quoted constructs" in the perlop document:

perldoc perlop
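
On the question of making a backslash+t pair in the replacement act as a TAB: the replacement is interpolated only once, so the escape has to be translated explicitly before it is used. One possible sketch follows; the %escape table and the variable names are hypothetical, and only the listed escapes are handled.

```perl
use strict;
use warnings;

# Hypothetical lookup table: only these escapes are translated.
my %escape = (t => "\t", n => "\n", r => "\r", '\\' => "\\");

my $raw = "\\t";    # two characters from string data: backslash, 't'

# Turn each recognised backslash sequence into the character it names.
(my $expanded = $raw) =~ s/\\([tnr\\])/$escape{$1}/g;

my $text = "a b";
(my $result = $text) =~ s/ /$expanded/;    # the space becomes a real TAB

printf "%d character(s) in the replacement\n", length $expanded;
```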


John
--
Those people who think they know everything are a great
annoyance to those of us who do.        -- Isaac Asimov
