R (Chandra) Chandrasekhar wrote:
Dear Folks,
Hello,
This is a question about s///sg across lines from a file slurped in file
mode.
I am trying to change occurrences of & into &amp;. As a minimal example,
I used the contrived file below where single- and multi-line records are
delimited by <...>. The only real-world text is a hyperlink from an
actual web site.
--------
<Pebbles & Pelicans>
<Hogsworthy Tales of a Late Summer Afternoon>
<Everything News
worthy & Printable>
# This is a comment and should be ignored
<Rough & Tumble>
<Pride & Prejudice>
<P&O Shipping Corporation>
<a
href=http://www.amazon.com/s?ie=UTF8&tag=mozilla-20&index=blended&link%5Fcode=qs&field-keywords=Programming%20Perl&sourceid=Mozilla-search>
<This record should not be
modified.>
<Nor this.>
<This is a very long line with many intervening & symbols This is a very
long line with many intervening & symbols This is a very long line with
many intervening & symbols This is a very long line with many
intervening & symbols This is a very long line with many intervening &
symbols This is a very long line with many intervening & symbols This is
a very long line with many intervening & symbols This is a very long
line with many intervening & symbols>
<sudo apt-get update && sudo apt-get upgrade>
--------
I tried this script on the above file:
Very good. Both data and code to work with. Thanks.
--------
#!/usr/bin/perl
use warnings;
use diagnostics;
use strict;
my ($fh, $file, $data, $count, @record, $record);
my (@orig, $orig, @repl, $repl, $subs, @amper);
You shouldn't declare all your variables here. It is better to declare
them in the smallest scope possible.
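A minimal sketch of what "smallest scope" can look like for the slurp step. The helper name slurp_file is made up for illustration, and it swaps the global "undef $/" for "local $/" inside a do block, which is a different technique from the original script:

```perl
use strict;
use warnings;

# Hypothetical helper: read a whole file into one string.
# $fh and $data exist only inside this sub, and the change to $/
# is undone automatically when the do block ends.
sub slurp_file {
    my ($file) = @_;
    open my $fh, '<', $file or die "Cannot open $file: $!\n";
    my $data = do { local $/; <$fh> };
    close $fh;
    return $data;
}

# Usage: only runs when a file name is given on the command line.
if (@ARGV) {
    my $data = slurp_file( $ARGV[0] );
    print length($data), " characters read\n";
}
```

Because $/ is localized, any later line-by-line read in the same program still sees the default record separator.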
undef $/; # Slurp data in file mode
$file = shift;
my $file = shift;
open $fh, '<', $file or die "Cannot open $file: $!\n";
open my $fh, '<', $file or die "Cannot open $file: $!\n";
$data = <$fh>;
my $data = <$fh>;
close $fh;
$count = 0;
my $count = 0;
while ($data =~ m|<\s*?(.*?)\s*?>|gis)
I ran this with "use re 'debug';" and that expression is pretty
inefficient. Also, the /i option requests a case-insensitive match, but
the pattern contains no characters that have different cases. This
would seem to be more efficient:
while ( $data =~ /<\s*([^<>]*)\s*>/g )
{
$count++;
print "$count: $1\n";
push @record, $1;
}
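Here is a small sketch of how the suggested pattern behaves on its own. The helper name extract_records and the sample string are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: return the text of every <...> record in a string.
sub extract_records {
    my ($text) = @_;
    my @records;
    # [^<>]* also matches newlines, so multi-line records need no /s.
    while ( $text =~ /<\s*([^<>]*)\s*>/g ) {
        push @records, $1;
    }
    return @records;
}

my $sample = "<Rough & Tumble>\n# a comment\n<Everything News\nworthy & Printable>\n";
my @records = extract_records($sample);
print scalar(@records), " records found\n";
print "[$_]\n" for @records;
```

One thing to be aware of: [^<>]* is greedy, so any whitespace just before the closing '>' ends up inside $1 and the trailing \s* matches nothing. Leading whitespace is still skipped by the first \s*.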
close $fh;
$count = 0;
foreach $record (@record)
foreach my $record ( @record )
{
$count++;
if (($record !~ m|&amp;|s) && ($record =~ m|&|s))
You are only choosing records that contain '&' but not '&amp;', but what
if a record contained both '&' and '&amp;'?
{
push @orig, $record;
push @amper, $count;
$subs = ($record =~ s|&|&amp;|gis); # Should be number of substitutions
The /s option only applies to the . meta-character and the /i option
only applies to characters that have different upper and lower case
representations. The pattern /&/ has neither of these characteristics.
Also, to handle records that contain both '&' and '&amp;', use a
negative look-ahead so that only a bare '&' is replaced:
my $subs = $record =~ s/&(?!amp;)/&amp;/g;
if ($subs == 0) {warn "$count: No replacement made.\n";}
else {print "$count: $subs replacement(s) made.\n";}
push @repl, $record;
}
}
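A minimal sketch of the negative look-ahead substitution in isolation. The helper name escape_ampersands and the test strings are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: escape each bare '&' but leave existing '&amp;' alone.
# Returns the new string and the number of substitutions made.
sub escape_ampersands {
    my ($text) = @_;
    # s///g returns the number of substitutions, or an empty string if
    # there were none; "|| 0" turns that into a printable 0.
    my $subs = ( $text =~ s/&(?!amp;)/&amp;/g ) || 0;
    return ( $text, $subs );
}

for my $record ( 'Rough & Tumble', 'Pride &amp; Prejudice', 'P&O &amp; Co', 'Nor this.' ) {
    my ( $fixed, $subs ) = escape_ampersands($record);
    print "$subs: $fixed\n";
}
```

Note that the look-ahead only protects '&amp;'. Any other entity in the data, such as '&lt;', would still have its '&' escaped.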
foreach $orig (@orig)
foreach my $orig ( @orig )
{
$subs = 0;
$repl = shift @repl;
$count = shift @amper;
my $subs = 0;
my $repl = shift @repl;
my $count = shift @amper;
$subs = ($data =~ s|$orig|$repl|gs);
You have a problem with this record:
<a
href=http://www.amazon.com/s?ie=UTF8&tag=mozilla-20&index=blended&link%5Fcode=qs&field-keywords=Programming%20Perl&sourceid=Mozilla-search>
which is not working. It fails because the string contains regular
expression meta-characters, so the pattern no longer matches the literal
text in $data. ('com/s?ie' will match either 'com/sie' or 'com/ie' but not
'com/s?ie'.) You have to use quotemeta to get it to match correctly:
my $subs = $data =~ s/\Q$orig/$repl/g;
if ($subs == 0) {warn "$count: No replacement made.\n";} # Why?
else {print "$count: $subs replacement(s) made.\n";}
}
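A minimal sketch of the difference \Q makes, using a shortened version of the URL. The helper name replace_literal and the sample strings are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: replace every literal occurrence of $orig in $data.
# \Q...\E applies quotemeta to the interpolated string.
sub replace_literal {
    my ( $data, $orig, $repl ) = @_;
    my $subs = ( $data =~ s/\Q$orig\E/$repl/g ) || 0;
    return ( $data, $subs );
}

my $data = '<a href=http://www.amazon.com/s?ie=UTF8&tag=mozilla-20>';
my $orig = 'amazon.com/s?ie=UTF8&tag';
my $repl = 'amazon.com/s?ie=UTF8&amp;tag';

# Without \Q, 's?' means "an optional s", so the literal '?' is never matched.
my $unquoted = () = $data =~ /$orig/g;

my ( $fixed, $subs ) = replace_literal( $data, $orig, $repl );
print "unquoted matches: $unquoted, quoted substitutions: $subs\n";
print "$fixed\n";
```

Only the pattern side needs quoting. The replacement side is interpolated once and its contents are not treated as a pattern.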
open $fh, '>', "$file.new" or die "Cannot open $file.new: $!\n";
print $fh $data;
close $fh;
John
--
Perl isn't a toolbox, but a small machine shop where you
can special-order certain sorts of tools at low cost and
in short order. -- Larry Wall