I am slowly making my way through scraping the data behind a form. With
the script below I can now get five results plus a series of links. I
need help with the following:

1) Eliminating all material on the page other than the list and the
   links (and ultimately the link numbers as well);
2) Following the links so that the five listings behind each link are
   returned;
3) Returning the results for all states (i.e. all listings) rather than
   just Ohio.

From my tutorial it looks like I need

foreach my $link ($browser->find_all_links( url_regex => SOMETHING )) {

where SOMETHING is based on the url that appears upon following a link.
But from there I'm stumped. The url of the form is in the script below.
Thanks in advance.
Ken
use strict;
use warnings;
use WWW::Mechanize;

my $output_dir   = "c:/training/bc";
my $starting_url =
    "http://www.theblackchurchpage.com/modules.php?name=Locator";

my $browser = WWW::Mechanize->new();
$browser->get( $starting_url );

# Select the third form on the page and search for Ohio.
$browser->form_number( 3 );
$browser->field( "church_state", "OH" );
$browser->submit();

# Save the raw HTML of the results page (it is HTML despite the .xls name).
open my $out, '>', "$output_dir/bc7.xls" or die "Can't open file: $!";
print $out $browser->content;
close $out or die "Can't close file: $!";

# Also echo the page to the screen.
print $browser->content;
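
For reference, here is roughly how I understand the loop from the tutorial
would fit in after the submit. This is only a sketch: the pattern
qr/name=Locator/ is my guess at what the result links have in common, not
something I have confirmed against the page, so it may match too many or
too few links.

```perl
use strict;
use warnings;
use WWW::Mechanize;

my $browser = WWW::Mechanize->new();
$browser->get("http://www.theblackchurchpage.com/modules.php?name=Locator");
$browser->form_number(3);
$browser->field( "church_state", "OH" );
$browser->submit();

# ASSUMPTION: the links to further result pages contain "name=Locator".
# Replace this pattern with whatever the real result URLs share.
my @links = $browser->find_all_links( url_regex => qr/name=Locator/ );

# The links were collected before the loop, so fetching each page in turn
# does not disturb the list being iterated over.
foreach my $link (@links) {
    my $url = $link->url_abs;    # absolute URL of the link
    $browser->get($url);
    print "--- $url ---\n";
    print $browser->content;
}
```

This still prints each page in full, so it only covers the second item on
my list, not the first or the third.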