On Thu, 27 Jul 2000, Jeffrey W. Baker wrote:

> On Thu, 27 Jul 2000, Paul J. Lucas wrote:
> 
> >     http://www.best.com/~pjl/software/html_tree/
> 
> Hey, that's really nice.

        Thanks.  :)  Admittedly, the web site could use more examples
        other than what's in the manual pages, but where, oh where, to
        find the time...

> And once the template HTML files are parsed, I can just leave the $root_node
> in memory for future requests, right?

        Yes, I suppose so...  I never thought hard about doing that.
        You would have to be careful, though: the Perl/mod_perl API
        *intends* for you to manipulate/change the structure/content,
        hence what you're left with after a page generation is not the
        same HTML tree you started with.
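
        One way to keep a parsed tree around safely is to treat it as
        a pristine master and hand each request its own copy.  The
        following is only a language-neutral sketch in Python, not the
        HTML Tree API: the Node class, parse_template() and render()
        are all made-up names for illustration.

```python
import copy

class Node:
    """Hypothetical stand-in for a parsed HTML element node."""
    def __init__(self, name, text="", children=None):
        self.name = name
        self.text = text
        self.children = children or []

def parse_template():
    # Stand-in for the (expensive) parse step; run once at startup.
    return Node("html", children=[Node("title", text="{{title}}")])

# The cached tree that must never be modified.
PRISTINE_ROOT = parse_template()

def render(title):
    # Work on a private deep copy so the cached tree stays intact.
    root = copy.deepcopy(PRISTINE_ROOT)
    root.children[0].text = title
    return root

first = render("Page one")
second = render("Page two")
```

        Whether copying a tree is actually cheaper than re-parsing the
        file would have to be measured.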

        The HTML DOM parsing is really fast since that part of the
        software is written natively in C++ with memory-mapped I/O.
        (It's the same parsing engine used in SWISH++, FWIW.)

        HTML Tree does, however, cache the associated Perl (.pm) code
        for a page just like Apache::Registry does (based on timestamp).
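
        The general idea of timestamp-based caching can be sketched as
        below.  This is an illustrative Python sketch, not the actual
        HTML Tree or Apache::Registry code; load_cached() and
        read_text() are hypothetical names.

```python
import os
import tempfile

_cache = {}  # path -> (mtime_ns, loaded value)

def load_cached(path, loader):
    """Return loader(path), reusing the cached value while the file's
    modification time is unchanged."""
    mtime = os.stat(path).st_mtime_ns
    entry = _cache.get(path)
    if entry is not None and entry[0] == mtime:
        return entry[1]
    value = loader(path)
    _cache[path] = (mtime, value)
    return value

calls = []  # records every real (uncached) load

def read_text(path):
    calls.append(path)
    with open(path) as f:
        return f.read()

with tempfile.TemporaryDirectory() as d:
    page = os.path.join(d, "page.pm")
    with open(page, "w") as f:
        f.write("v1")
    a = load_cached(page, read_text)  # loaded from disk
    b = load_cached(page, read_text)  # served from the cache
    with open(page, "w") as f:
        f.write("v2")
    st = os.stat(page)
    # Force a visibly newer timestamp regardless of clock resolution.
    os.utime(page, ns=(st.st_atime_ns, st.st_mtime_ns + 1_000_000_000))
    c = load_cached(page, read_text)  # timestamp changed, so reloaded
```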

> And since the parser isn't validating, I can extend HTML with tags like
> <NODE>, <BANNER>, and <WHATEVER>, right?

        Not really, actually.  The reason is that the HTML parser
        doesn't know anything about non-standard HTML elements so it
        can't know if an element (not "tag") you make up has a required
        or optional end tag to get the parsing ("balancing") right.  If
        this were XML, it would be a different story.

        Currently, "made up" elements are ignored entirely, i.e., they
        are *not* added to the resultant HTML DOM tree and do not
        affect "balancing" during parsing.

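        To illustrate that behavior, here is a small sketch using
        Python's standard html.parser module rather than the HTML Tree
        parser itself.  The KNOWN set and the TreeBuilder class are
        made up for the example, and keeping the content found inside
        an unknown element is an assumption of this sketch.

```python
from html.parser import HTMLParser

KNOWN = {"html", "body", "p", "b"}  # illustrative subset of elements

class TreeBuilder(HTMLParser):
    """Builds (name, children) tuples, skipping unknown elements."""
    def __init__(self):
        super().__init__()
        self.root = ("#root", [])
        self.stack = [self.root]  # currently open elements

    def handle_starttag(self, tag, attrs):
        if tag not in KNOWN:
            return  # unknown element: no node, open-element stack untouched
        node = (tag, [])
        self.stack[-1][1].append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if tag not in KNOWN:
            return  # unknown end tag does not affect balancing
        # Close everything back to the nearest matching open element.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self.stack[-1][1].append(data.strip())

builder = TreeBuilder()
builder.feed("<p>Hello <banner><b>world</b></banner></p>")
builder.close()
tree = builder.root
```

        Here <banner> leaves no node in the tree, and the <b> element
        it encloses ends up attached directly to the <p>.
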
        - Paul
