Hi,

If the following belongs on another mailing list, please let me know.

I'm the author of HTML::TokeParser::Simple.
http://search.cpan.org/author/OVID/HTML-TokeParser-Simple-1.3/

Someone recently contacted me about a problem which the test case below
illustrates:

  use HTML::TokeParser::Simple;
  my $p  = HTML::TokeParser::Simple->new( $somefile );

  while ( my $t = $p->get_token ) {
    print $t->get_trimmed_text('/font');
  }

The problem is that they are trying to call a parser method on a token.  The
problem goes away if "print $t->..." is changed to "print $p->..."

What this means is that I have a design error which is confusing
programmers.  When either get_token() or get_tag() is called, the resulting
arrayref is blessed into the HTML::TokeParser::Simple class which inherits
from HTML::TokeParser.  This means tokens improperly inherit parser methods.
I was so close to the module that I didn't see how this would be confusing.

My thoughts on a fix:  I would create a separate class,
HTML::TokeParser::Simple::Token, and bless the tokens into that class, which
doesn't inherit anything (though perhaps the namespace is confusing?).
Further, I add an AUTOLOAD method to warn programmers when they're trying to
call a parser method on a token (checking whether
HTML::TokeParser->can($method)).  I can also include a "production" switch
to disable AUTOLOAD for production, thereby eliminating the performance hit
(parsing HTML is slow enough as is).
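
A rough sketch of what I have in mind follows.  Nothing here is in the
released module: $PARSER_CLASS and $PRODUCTION are placeholder names, the
new() constructor is only for illustration, and the error wording is made
up.  The flag version of the "production" switch shown here only skips the
can() check; dropping AUTOLOAD entirely would be another way to do it.

```perl
{
    package HTML::TokeParser::Simple::Token;    # proposed name, see above
    use strict;
    use warnings;
    use Carp ();

    # Placeholder names, not part of any released API.
    our $PARSER_CLASS = 'HTML::TokeParser';  # class whose methods we warn about
    our $PRODUCTION   = 0;                   # true skips the can() check

    our $AUTOLOAD;

    # Tokens are plain arrayrefs; bless them into this class, which has no
    # @ISA and therefore inherits no parser methods.
    sub new {
        my ( $class, $token ) = @_;
        return bless $token, $class;
    }

    # Reached only when a method is not defined in this package.
    sub AUTOLOAD {
        my $self = shift;
        ( my $method = $AUTOLOAD ) =~ s/.*:://;
        return if $method eq 'DESTROY';      # never complain about DESTROY

        if ( !$PRODUCTION && $PARSER_CLASS->can($method) ) {
            Carp::croak( "$method() is a parser method; "
                  . "call it on the parser, not on a token" );
        }
        Carp::croak( qq{Can't locate object method "$method" via package "}
              . ref($self)
              . '"' );
    }
}
```

With this in place, the "$t->get_trimmed_text('/font')" call from the test
case above would die with a message pointing at the parser instead of
quietly misbehaving.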

Does this sound like a reasonable strategy to folks?  Should I skip the
AUTOLOAD idea and trust programmers to use the module correctly ... assuming
that I document what is going on?  Also, is there anyone out there using
this module who would like to see added functionality?  It's my most popular
module on the CPAN and I'd like to make it more so.

Cheers,
Ovid
