> -----Original Message-----
> From: Christian Wattengård [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, February 18, 2004 4:41 AM
> To: [EMAIL PROTECTED]
> Subject: Extracting data from html structure.
>
>
> I have the following html structure:
> ------------------------------------------------------------------------
[long HTML snipped]
> ------------------------------------------------------------------------
> And I want to extract from it the checkbox values and their respective
> channel names (contained in the link beside the checkbox).
> I have checked a lot of modules on CPAN but I haven't found one that
> does it just the way I want it to yet. Actually, I haven't found any
> that I can get to work at all.
>
> Any tips?
>
> Christian...
>

I snipped the HTML you provided because it was so long. Please try to
trim it down next time.

Anyhow, I think the code below does what you want.

use strict;
use warnings;
use HTML::Parser;

my $HTML = <<EOF;
<table border=0 cellpadding=0 cellspacing=0 width=156>
<tr>
<td colspan=2 bgcolor=#CDC9C0><b><font face=verdana,arial,helvetica,sans-serif size=-2 color=#666666>
Norske</font></b></td>
</tr>
<tr>
<td width=78 valign=top><font class=link-00-ul-l size=1>
<input type="checkbox" name=kanal_id[] value=1 CHECKED>
<a href="index.html?kanal_id=1&dag=0&fra_tid=0&til_tid=24&kategori_id=">NRK 1</a><br>
<input type="checkbox" name=kanal_id[] value=3 >
<a href="index.html?kanal_id=3&dag=0&fra_tid=0&til_tid=24&kategori_id=">TV 2</a><br>
<input type="checkbox" name=kanal_id[] value=5 >
<a href="index.html?kanal_id=5&dag=0&fra_tid=0&til_tid=24&kategori_id=">TVNorge </a><br>
</font></td>
</tr>
</table>
EOF

# Name of the most recently opened tag. Start it as an empty string so
# the text handler never compares against undef.
# I'm not happy with using this. Is there a better way? Anyone?
my $current_tag = '';

my $p = HTML::Parser->new(
    api_version => 3,
    start_h     => [ \&start_tag, 'tagname,attr' ],
    text_h      => [ \&text,      'text' ]
);
$p->parse($HTML);
$p->eof;

# Called for every start tag; prints "value=" for each checkbox.
sub start_tag {
    my $name  = shift;
    my $attrs = shift;

    $current_tag = $name;

    # An <input> without a type attribute would otherwise trigger an
    # "uninitialized value" warning, so fall back to an empty string.
    if ($name eq 'input' and ($attrs->{'type'} || '') eq 'checkbox') {
        print $attrs->{'value'}, "=";
    }
}

# Called for every run of text; prints it when the last tag opened was <a>.
sub text {
    my $text = shift;

    if ($current_tag eq 'a') {
        print "$text\n";
    }
}
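On the "is there a better way?" question: one possible alternative, offered as a rough sketch and not as the definitive answer, is to add an end-tag handler and keep the state in lexical variables inside closures, so nothing depends on a global that stays set after the link has closed. The names extract_channels, $pending, $in_link and $label are made up for this example, and the sample HTML is a shortened version of the snippet above. It assumes every checkbox is followed by its link before the next checkbox appears.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper (not from the original post): returns a list of
# [checkbox value, link text] pairs found in $html.
sub extract_channels {
    my ($html) = @_;
    my @pairs;
    my $pending;         # value of the last checkbox still waiting for a link
    my $in_link = 0;     # true only between <a ...> and </a>
    my $label   = '';    # link text collected so far

    my $p = HTML::Parser->new(
        api_version => 3,
        start_h => [ sub {
            my ($tag, $attr) = @_;
            my $type = defined $attr->{type} ? lc($attr->{type}) : '';
            if ($tag eq 'input' and $type eq 'checkbox') {
                $pending = $attr->{value};
            }
            elsif ($tag eq 'a' and defined $pending) {
                $in_link = 1;
                $label   = '';
            }
        }, 'tagname,attr' ],
        # Text can arrive in several chunks, so append instead of assigning.
        text_h => [ sub {
            my ($text) = @_;
            $label .= $text if $in_link;
        }, 'dtext' ],
        end_h => [ sub {
            my ($tag) = @_;
            return unless $tag eq 'a' and $in_link;
            $label =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
            push @pairs, [ $pending, $label ];
            ($pending, $in_link) = (undef, 0);
        }, 'tagname' ],
    );
    $p->parse($html);
    $p->eof;
    return @pairs;
}

# Shortened sample input, assumed for illustration only.
my $sample = <<'EOF';
<input type="checkbox" name=kanal_id[] value=1 CHECKED>
<a href="index.html?kanal_id=1">NRK 1</a><br>
<input type="checkbox" name=kanal_id[] value=3 >
<a href="index.html?kanal_id=3">TV 2</a><br>
EOF

printf "%s=%s\n", @$_ for extract_channels($sample);
```

Returning the pairs instead of printing inside the handlers keeps the parsing separate from the output, so the caller can build a hash or a list as needed. Links that are not preceded by a checkbox are ignored.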