I finally got around to finishing HTML::TokeParser::Simple (was:
HTML::TokeParser::Easy) and released it to the CPAN:
http://theoryx5.uwinnipeg.ca/mod_perl/cpan-search?search=HTML%3A%3ATokeParser%3A%3ASimple

This is a subclass of HTML::TokeParser that blesses the returned tokens so
you can call methods on them.  The original tokens are unchanged, so you
should be able to use this as a drop-in replacement.  Basically, you get
convenient methods instead of having to memorize array indices.  You can do
this:

    $token->is_start_tag( 'form' )

Instead of

    $token->[0] eq 'S' and $token->[1] eq 'form'
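
To show why both forms can coexist, here is a minimal sketch of the blessing
idea in plain Perl.  My::Token is a hypothetical stand-in written for this
example, not the module's actual source, and the token is built by hand in
the array layout HTML::TokeParser uses for start tags.

```perl
use strict;
use warnings;

# Hypothetical stand-in class for illustration only.
# This is NOT the real HTML::TokeParser::Simple code.
package My::Token;

# True if the token is a start tag, optionally matching a tag name.
sub is_start_tag {
    my ( $self, $tag ) = @_;
    return 0 unless $self->[0] eq 'S';
    return 1 unless defined $tag;
    return lc( $self->[1] ) eq lc($tag) ? 1 : 0;
}

package main;

# A hand-built start-tag token in the HTML::TokeParser layout:
# [ 'S', $tag, \%attr, \@attrseq, $text ]
my $token = [
    'S', 'form',
    { action => '/search' },
    ['action'],
    '<form action="/search">',
];

# Blessing attaches a class to the array ref; the contents are untouched.
bless $token, 'My::Token';

# The old array-index test still works on the blessed token ...
my $old_style = ( $token->[0] eq 'S' and $token->[1] eq 'form' ) ? 1 : 0;

# ... and so does the method call.
my $new_style = $token->is_start_tag('form') ? 1 : 0;

print "old: $old_style, new: $new_style\n";
```

Since the array ref keeps its contents after bless, code that indexes into
the token keeps working alongside the method calls.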

A pathetic, but easy to read, HTML to Text converter:

    while ( my $token = $parser->get_token ) {
        next if ! $token->is_text;
        print $token->return_text;
    }

Printing all comments:

    while ( my $token = $parser->get_token ) {
        next if ! $token->is_comment;
        print PHB $token->return_text, "\n";
    }

You get the idea.  There are a couple of goofs in the POD (white noise,
basically, no errors that I am aware of), but the tests are fairly solid.
You can use either get_token or get_tag; just be sure to read the POD for a
couple of caveats.

--
Cheers,
Curtis Poe
Senior Programmer
ONSITE! Technology, Inc.
www.onsitetech.com
503-233-1418

Taking e-Business and Internet Technology To The Extreme!
