Re: New HTML::TokeParser Interface

Sean M. Burke Sun, 03 Feb 2002 00:02:45 -0800

At 23:28 2002-02-01 -0800, Gisle Aas wrote:
>[...]
>I think blessing of the tokens might have merit.  I also think that
>HTML::TokeParser (and HTML::PullParser) should have some kind of
>support for this.
>[...]


I'm doing something like that for an HTML::TokeParser-like interface to my
new Pod parser system.  It's very lightweight, basically just accessors for
the items in each blessed arrayref, so the user never has to know indexes
for anything.  I don't see much point in doing anything more ambitious.

Applying it to HTML::TokeParser would basically mean something like this:


For tokens (as gotten from get_token):

["S",  $tag, $attr, $attrseq, $text]
$x->type gets 'S'
$x->tag gets $tag
$x->attr('name') gets you $attr->{'name'}
$x->attr_hashref() gets you $attr
$x->attr_order gets you  @$attrseq
$x->source gets you $text

["E",  $tag, $text]
$x->type gets 'E'
$x->tag gets $tag
$x->source gets you $text

["T",  $text, $is_data]
$x->type gets 'T'
$x->text gets you $text
$x->text_decoded gets you
      $is_data ? $text : HTML::Entities::decode_entities($text)
$x->cdata gets you $is_data
$x->decodeable gets you !$is_data
$x->source gets you ... what?

["C",  $text]
$x->type gets 'C'
$x->text gets you $text
$x->source gets you ... what?

["D",  $text]
$x->type gets 'D'
$x->text gets you $text
$x->source gets you ... what?

["PI", $token0, $text]
$x->type gets 'PI'
$x->target gets you $token0
$x->text gets you $text
$x->source gets you ... what?
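
As a rough illustration of the accessor idea above, here is a minimal sketch for the "S", "E" and "T" tokens.  The My::Token::* package names and the bless_token() helper are hypothetical, made up for this example; only the accessor names come from the list above.  It assumes the arrayref layouts shown, and it leaves out source() for "T" tokens, since that question is still open.

```perl
use strict;
use warnings;

# Hypothetical class names; only the accessor names come from the proposal.
package My::Token;
sub type { $_[0][0] }

package My::Token::Start;    # ["S", $tag, $attr, $attrseq, $text]
our @ISA = ('My::Token');
sub tag          { $_[0][1] }
sub attr         { $_[0][2]{ $_[1] } }
sub attr_hashref { $_[0][2] }
sub attr_order   { @{ $_[0][3] } }
sub source       { $_[0][4] }

package My::Token::End;      # ["E", $tag, $text]
our @ISA = ('My::Token');
sub tag    { $_[0][1] }
sub source { $_[0][2] }

package My::Token::Text;     # ["T", $text, $is_data]
our @ISA = ('My::Token');
sub text       { $_[0][1] }
sub cdata      { $_[0][2] }
sub decodeable { !$_[0][2] }
sub text_decoded {
    my $self = shift;
    return $self->[1] if $self->[2];    # literal data: nothing to decode
    require HTML::Entities;             # loaded only when actually needed
    my $copy = $self->[1];              # decode a copy, not the token itself
    HTML::Entities::decode_entities($copy);
    return $copy;
}

package main;

my %class_for = (
    S => 'My::Token::Start',
    E => 'My::Token::End',
    T => 'My::Token::Text',
);

# Bless a token arrayref in place; unknown types are returned unchanged.
sub bless_token {
    my ($token) = @_;
    my $class = $class_for{ $token->[0] } or return $token;
    return bless $token, $class;
}

my $x = bless_token(
    [ 'S', 'a', { href => 'x.html' }, ['href'], '<a href="x.html">' ] );
print $x->tag, " ", $x->attr('href'), "\n";
```

Because the token is still the same arrayref, code that indexes into it directly ($x->[1] and so on) would keep working alongside the accessors.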



For tags (as gotten from get_tag):

[$tag, $attr, $attrseq, $text]
$x->type gets 'S'
$x->tag gets $tag
$x->attr('name') gets you $attr->{'name'}
$x->attr_hashref() gets you $attr
$x->attr_order gets you  @$attrseq
$x->source gets you $text

["/$tag", $text]
$x->type gets 'E'
$x->tag gets $tag  (without the "/")
$x->source gets you $text
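
The get_tag case could be handled by one class that tells start tags from end tags by the leading "/".  This is only a sketch under that assumption: My::Tag and is_end() are hypothetical names, and returning undef or an empty list from the attribute accessors on end tags is my own choice, not something settled above.

```perl
use strict;
use warnings;

package My::Tag;    # hypothetical name

# True for ["/$tag", $text], false for [$tag, $attr, $attrseq, $text].
sub is_end { $_[0][0] =~ m{^/} ? 1 : 0 }

sub type { $_[0]->is_end() ? 'E' : 'S' }

# Tag name without the leading "/".
sub tag {
    my $name = $_[0][0];
    $name =~ s{^/}{};
    return $name;
}

# The source text is the last item in both layouts.
sub source { $_[0][-1] }

# Attribute accessors only mean something for start tags.
sub attr_hashref { $_[0]->is_end() ? undef : $_[0][1] }
sub attr_order   { $_[0]->is_end() ? ()    : @{ $_[0][2] } }
sub attr {
    my ( $self, $name ) = @_;
    return $self->is_end() ? undef : $self->[1]{$name};
}

package main;

my $start = bless [ 'a', { href => 'x.html' }, ['href'], '<a href="x.html">' ],
    'My::Tag';
my $end = bless [ '/a', '</a>' ], 'My::Tag';

printf "%s %s\n", $_->type, $_->tag for $start, $end;
```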


--
Sean M. Burke    [EMAIL PROTECTED]    http://www.spinn.net/~sburke/
