p...@highdeck.com wrote:
Hi all,
Hello,
I have some basic code that I want to use to pull the web addresses out of web pages.
Would like to keep it as basic as possible for easy reading.
The line to replace http with newline seems to work ok.
However, the "match" line doesn't seem to pull out the required lines
and I'm not too sure about the split either.
I think it's this line "if ($serverlist[$index] =~ /:\/\//)"
that's not giving me what I want.
If I comment out the program after $serverlist[$index] =~ s/http/\n/g;
and then pipe it to grep eg
./webprog.pl |grep ://|cut -d"/" -f3
I get the desired output, to a point, but I'd like to do it all in Perl.
Thanks for your time.
Alan.
#!/usr/bin/perl
#
#
# Build Initial list and put into array.
@serverlist = `/usr/bin/wget -q -O - http://www.anyserver.com`;
for ($index = 0; $index <= $#serverlist; $index++) {
#replace http with newline, all .com etc should now be in 3rd field "/"
$serverlist[$index] =~ s/http/\n/g;
#pull out all lines with :// like "grep"
# as these should contain web addresses.
if ($serverlist[$index] =~ /:\/\//) # does not seem to do what it should
{
#print $serverlist[$index];
# pull out 3rd field eg. ://my.server.com/
print ((split/\//, $serverlist[$index])[2]); # like cut -d"/" -f3
# should now be my.server.com
}
}
It works for me:
$ /usr/bin/wget -q -O - http://www.anyserver.com | grep '://' | cut -d"/" -f3 | wc
97 93 1656
$ perl -le'
my @serverlist = `/usr/bin/wget -q -O - http://www.anyserver.com`;
for ( my $index = 0; $index <= $#serverlist; $index++ ) {
$serverlist[ $index ] =~ s/http/\n/g;
if ( $serverlist[ $index ] =~ /:\/\// ) {
print( ( split /\//, $serverlist[ $index ] )[ 2 ] );
}
}
' | wc
98 94 1653
Or better as:
$ perl -le'
my @serverlist = `/usr/bin/wget -q -O - http://www.anyserver.com`;
for ( @serverlist ) {
if ( /:\/\// ) {
print( ( split /\// )[ 2 ] );
}
}
' | wc
97 93 1656
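One caveat with split is that it returns the third "/"-delimited field of the whole line, so a line with an earlier slash, or with more than one address, yields a single and possibly wrong field. As a rough sketch of an alternative (not what the original program does), a capturing match with /g can collect every host that follows "://". The helper name extract_hosts and the sample markup are made up for illustration, and the sample array stands in for the wget output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return every host name that follows "://" in the given lines.
# extract_hosts is a hypothetical helper name, not from the original post.
sub extract_hosts {
    my @lines = @_;
    my @hosts;
    for my $line (@lines) {
        # In list context, a match with /g returns all captures on the line.
        # The host ends at the next slash, quote, whitespace or angle bracket.
        push @hosts, $line =~ m{://([^/"'\s<>]+)}g;
    }
    return @hosts;
}

# Made-up sample input standing in for the wget output.
my @sample = (
    qq{<p><a href="http://my.server.com/index.html">one</a></p>\n},
    qq{<a href="http://a.example.org/">a</a> <a href="https://b.example.org/x">b</a>\n},
    qq{<p>no links here</p>\n},
);

print "$_\n" for extract_hosts(@sample);
```

To run it against a live page, the sample array could be replaced with the backtick wget call used above.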
John
--
Those people who think they know everything are a great
annoyance to those of us who do. -- Isaac Asimov