"cpaul" <[EMAIL PROTECTED]> writes:
> > use HTML::PullParser;
> > $p = HTML::PullParser->new(file => "index.html",
> > start => "event, tag",
> > end => "event, tag",
> > ignore_elements => [qw(script style)],
> > ) || die "Can't open: $!";
> >
> > while (my $token = $p->get_token) {
> > #...do something with $token
> > }
>
>
> if i may ask, what is "event, tag"
It is the 'argspec' string you find documented in HTML::Parser. It is
a comma-separated list of the values you want reported for that event.
The 'tag' should actually be 'tagname'. The 'event' will be passed out
as 'start' or 'end' in this case.
A complete example:
---------------------------------------------------------------------
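use strict;
use warnings;
use HTML::PullParser;
use Data::Dump qw(dump);

my $doc = <<'EOT';
<TITLE>Foo</TITLE>
<script>
<foo>
</script>
<h1>Hi</h1>
EOT

# Report only start and end tags, each as [event, tagname].
# The <script> element and everything inside it is skipped.
my $p = HTML::PullParser->new(doc => $doc,
                              start => 'event,tagname',
                              end => 'event,tagname',
                              ignore_elements => ['script'],
                             );

# Each token is an array ref built from the argspec for its event.
while (my $t = $p->get_token) {
    print dump($t), "\n";
}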
---------------------------------------------------------------------
This will produce:
["start", "title"]
["end", "title"]
["start", "h1"]
["end", "h1"]
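
The argspec can ask for more than the event and tag name. Here is a minimal sketch, assuming you also want the attributes of start tags and the decoded text between tags. The input document is made up for illustration; 'attr' and 'dtext' are argspec names documented in HTML::Parser.

```perl
use strict;
use warnings;
use HTML::PullParser;

# Hypothetical input, made up for this illustration.
my $doc = '<a href="x.html">Link</a>';

my $p = HTML::PullParser->new(
    doc   => $doc,
    start => 'event,tagname,attr',   # attr: hash ref of the tag's attributes
    end   => 'event,tagname',
    text  => 'event,dtext',          # dtext: text with entities decoded
);

my @lines;
while (my $t = $p->get_token) {
    # The token layout follows the argspec of the event that produced it.
    my ($event, $value, $attr) = @$t;
    if ($event eq 'start' && defined $attr->{href}) {
        push @lines, "start $value href=$attr->{href}";
    }
    else {
        push @lines, "$event $value";
    }
}
print "$_\n" for @lines;
```

Since every token starts with the event name, the loop can dispatch on the first element and then unpack the rest according to the argspec given for that event.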
Regards,
Gisle