"cpaul" <[EMAIL PROTECTED]> writes:

> >  use HTML::PullParser;
> >  $p = HTML::PullParser->new(file  => "index.html",
> >                             start => "event, tag",
> >                             end   => "event, tag",
> >                             ignore_elements => [qw(script style)],
> >                            ) || die "Can't open: $!";
> > 
> >  while (my $token = $p->get_token) {
> >      #...do something with $token
> >  }
> 
> 
> if i may ask, what is "event, tag" 

It is the 'argspec' string documented in HTML::Parser: a comma-separated
list of the values you want reported for each event.  The 'tag' in the
snippet above should actually be 'tagname'.  The 'event' will be passed
out as 'start' or 'end' in this case.

A complete example:
---------------------------------------------------------------------
use strict;
use warnings;
use HTML::PullParser;
use Data::Dump qw(dump);

my $doc = <<'EOT';
<TITLE>Foo</TITLE>
<script>
<foo>
</script>

<h1>Hi</h1>
EOT

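# Report only start and end tags; everything inside <script> is skipped.
my $p = HTML::PullParser->new(doc => $doc,
                              start => 'event,tagname',
                              end   => 'event,tagname',
                              ignore_elements => ['script'],
                             );

# Each token is an array ref holding the values named in the argspec.
while (my $t = $p->get_token) {
    print dump($t), "\n";
}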
---------------------------------------------------------------------
This will produce:

["start", "title"]
["end", "title"]
["start", "h1"]
["end", "h1"]
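
The same argspec mechanism can report attributes and text as well.  A
minimal sketch, assuming the 'attr' and 'dtext' identifiers and quoted
literal strings as described in the HTML::Parser argspec documentation;
the sample document and the "S"/"T"/"E" labels are made up for
illustration:

```perl
use strict;
use warnings;
use HTML::PullParser;

# Hypothetical sample document, for illustration only.
my $doc = '<a href="http://example.com/">Example</a>';

# A quoted string inside an argspec is passed through as a literal,
# so each token starts with a one-letter type label.
my $p = HTML::PullParser->new(
    doc   => $doc,
    start => '"S", tagname, attr',   # attr is a hash ref of attributes
    text  => '"T", dtext',           # dtext is text with entities decoded
    end   => '"E", tagname',
);

my @seen;
while (my $t = $p->get_token) {
    my ($type, @args) = @$t;
    if ($type eq 'S') {
        my ($tag, $attr) = @args;
        push @seen, "start $tag href=" . ($attr->{href} // '');
    }
    elsif ($type eq 'T') {
        push @seen, "text $args[0]";
    }
    else {
        push @seen, "end $args[0]";
    }
}

print "$_\n" for @seen;
```

Putting a literal label first in each argspec makes it easy to dispatch
on the token type without also asking for 'event'.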

Regards,
Gisle
