Hi Bianca,
I had a look at the HTML and your code, and I dumped the resulting
structure from HTML::TableContentParser's parse method (see below).
Shooting from the hip, it seems that HTML::TableContentParser does
not support the structure of the page you want to parse.
You might have to look into something like HTML::Parser or
HTML::TokeParser.
I have included something I whipped up based on HTML::TokeParser; let
me know if you have questions.
jonasbn
$VAR1 = [
  {
    'headers' => [
      {
        'data' => 'Bug #38516'
      },
      {
        'data' => 'Submitted:'
      },
      {
        'data' => 'Modified:'
      },
      {
        'data' => 'Reporter:'
      },
      {
        'data' => 'Status:'
      },
      {
        'data' => 'Category:'
      },
      {
        'data' => 'Severity:'
      },
      {
        'data' => 'Version:'
      },
      {
        'data' => 'OS:'
      },
      {
        'data' => 'Assigned to:'
      },
      {
        'data' => 'Target Version:'
      },
      {
        'data' => 'Tags:'
      },
      {
        'data' => 'Triage:'
      }
    ],
    'style' => 'width: 100%',
    'id' => 'bugheader',
    'rows' => [
      {
        'data' => '
',
        'id' => 'title'
      },
      {
        'data' => '
'
      },
      {
        'data' => '
'
      },
      {
        'data' => '
'
      },
      {
        'data' => '
'
      },
      {
        'data' => '
'
      },
      {
        'data' => '
'
      },
      {
        'data' => '
'
      },
      {
        'data' => '
'
      }
    ]
  }
];
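The dump also shows why your loop prints "Row: " with nothing after it:
each entry under 'rows' has a 'data' key (and sometimes 'id') but no
'cells' key, so the inner loop has nothing to iterate over. Here is a
minimal sketch of that, using a hand-trimmed copy of the structure
instead of a live parse. The helper name count_cells is made up for
illustration.

```perl
use strict;
use warnings;

# Hand-trimmed copy of the dumped structure: rows carry 'data' but no 'cells'.
my $tables = [
    {
        headers => [ { data => 'Bug #38516' }, { data => 'Submitted:' } ],
        rows    => [ { data => "\n", id => 'title' }, { data => "\n" } ],
    },
];

# Hypothetical helper: number of cells in a row, 0 when 'cells' is missing.
sub count_cells {
    my ($row) = @_;
    return scalar @{ $row->{cells} || [] };
}

for my $t (@$tables) {
    for my $r ( @{ $t->{rows} } ) {
        printf "Row: %d cell(s)\n", count_cells($r);
    }
}
```

Every row reports zero cells, which matches the empty "Row: " lines you
are seeing.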
#
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::TokeParser;

my $URL = 'http://bugs.mysql.com/bug.php?id=38516';
get_tables($URL);
exit(0);

# Fetch the page and print the text of every <th>/<td> cell,
# two cells per line: label, tab, value.
sub get_tables {
    my $url = shift;

    my $ua       = LWP::UserAgent->new();
    my $response = $ua->get($url);
    my $page;
    if ( $response->is_success ) {
        $page = $response->content;    # raw response body
    }
    else {
        die $response->status_line;
    }

    my $p = HTML::TokeParser->new( \$page );
    my $i = 0;
    while ( $p->get_tag( "th", "td" ) ) {
        # Text between this cell's start tag and the next tag
        my $text = $p->get_text();
        if ( $i % 2 ) {
            print "$text\n";    # odd cell: ends the line
        }
        else {
            print "$text\t";    # even cell: starts the line
        }
        $i++;
    }
}
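
If you would rather collect the cells than print them, the same
get_tag/get_text loop can build label/value pairs. This is only a
sketch: the HTML fragment and its values are made up (it is not the
real bug page), and pair_cells is a hypothetical helper name. It also
assumes the cells alternate label, value, just as the script above does.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Made-up fragment in a label/value layout; not taken from the real page.
my $html = <<'HTML';
<table id="bugheader">
<tr><th>Status:</th><td>example status</td></tr>
<tr><th>Severity:</th><td>example severity</td></tr>
</table>
HTML

# Hypothetical helper: returns a list of [label, value] array references.
sub pair_cells {
    my ($content) = @_;
    my $p = HTML::TokeParser->new( \$content );
    my ( @pairs, @pending );
    while ( $p->get_tag( 'th', 'td' ) ) {
        # get_trimmed_text strips leading and trailing whitespace
        push @pending, $p->get_trimmed_text();
        if ( @pending == 2 ) {
            push @pairs, [@pending];
            @pending = ();
        }
    }
    return @pairs;
}

print join( "\t", @$_ ), "\n" for pair_cells($html);
```

A cell left over without a partner is dropped here; whether that is the
right choice depends on the page.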
On 14/09/2008, at 08.41, Bianca Shibuya wrote:
Hi there,
Can anybody help me with this?
I have this piece of code:
use Text::CSV;
use Date::Manip qw(ParseDate UnixDate);
use LWP::Simple;
use URI;
use HTML::TableContentParser;
use HTML::Entities;
sub get_tables {
    my $URL = shift;
    my $page = get($URL);
    die "Couldn't get $URL" unless defined $page;
    my $tcp = HTML::TableContentParser->new();
    return $tcp->parse($page);
}
my $URL = 'http://bugs.mysql.com/bug.php?id=38516';
my $tables = get_tables($URL); # returns a reference to an array
for $t (@$tables) {
    for $r (@{$t->{rows}}) {
        print "Row: ";
        for $c (@{$r->{cells}}) {
            print "[$c->{data}] ";
        }
        print "\n";
    }
}
It prints "Row: " 9 times, without any data.
Thank you.
Bianca