Since the page is already native XML, I would use XML::Simple to parse it into a hash. You can then format the output however you want by looping through the hash.
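Here is a rough sketch of what I mean, not a tested solution for the real pages. The sample XML is made up, and its element and attribute names (vote-data, recorded-vote, legislator, party, state, vote) are my assumption about how the vote pages are laid out, so check them against the actual file. The helper names csv_line and votes_to_csv are my own, and the four fields I keep are only a guess at the columns you want.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Made-up sample data. The element and attribute names are assumptions
# about the vote page layout; verify them against the real XML.
my $sample_xml = <<'XML';
<rollcall-vote>
  <vote-data>
    <recorded-vote>
      <legislator party="D" state="AA">Member One</legislator>
      <vote>Yea</vote>
    </recorded-vote>
    <recorded-vote>
      <legislator party="R" state="BB">Member Two</legislator>
      <vote>Nay</vote>
    </recorded-vote>
  </vote-data>
</rollcall-vote>
XML

# Build one CSV line: wrap every field in double quotes and
# double any quotes embedded in the field itself.
sub csv_line {
    my @quoted;
    for my $field (@_) {
        (my $copy = $field) =~ s/"/""/g;
        push @quoted, qq("$copy");
    }
    return join(',', @quoted) . "\n";
}

# Parse one vote document and return CSV text holding only the
# fields we care about (hypothetical choice of fields).
sub votes_to_csv {
    my ($xml) = @_;
    my $data = XMLin(
        $xml,
        ForceArray => ['recorded-vote'],  # always an array ref, even for a single vote
        KeyAttr    => [],                 # do not fold arrays into hashes
    );

    my $csv = csv_line(qw(name party state vote));
    for my $rec (@{ $data->{'vote-data'}{'recorded-vote'} }) {
        my $leg = $rec->{legislator};     # attributes plus text under 'content'
        $csv .= csv_line($leg->{content}, $leg->{party}, $leg->{state}, $rec->{vote});
    }
    return $csv;
}

print votes_to_csv($sample_xml);
```

For several vote pages, you could loop over the roll numbers, fetch each page with WWW::Mechanize as your script already does, pass $browser->content to votes_to_csv, and write each result to its own .csv file.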
On 6/6/06, [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
The script below scrapes a House of Representatives vote page which is in xml and saves it in a spreadsheet which is best opened as an xls read only. How can I:

1) scrape multiple vote pages into individual spreadsheets with a single script?
2) Only scrape columns C, F, G, H in the result here?

I'd also prefer to have the spreadsheet as a csv, but that doesn't work by just changing *.xls to *.csv

Thanks in advance.
Ken

#!/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

my $output_dir = "c:/training/bc";
my $starting_url = "http://clerk.house.gov/evs/2005/roll667.xml";

my $browser = WWW::Mechanize->new();
$browser->get( $starting_url );

foreach my $line (split(/[\n\r]+/, $browser->content)) {
    print $line;
}

open OUT, ">$output_dir/vote667.xls" or die "Can't open file:$!";
foreach my $line (split(/[\n\r]+/, $browser->content)) {
    print OUT "$line";
}
close OUT;

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
<http://learn.perl.org/> <http://learn.perl.org/first-response>
--
Anthony Ettinger
Signature: http://chovy.dyndns.org/hcard.html