On Thu, 27 Jul 2000, Jeffrey W. Baker wrote:
> On Thu, 27 Jul 2000, Paul J. Lucas wrote:
>
> > http://www.best.com/~pjl/software/html_tree/
>
> Hey, that's really nice.
Thanks. :) Admittedly, the web site could use more examples
beyond what's in the manual pages, but where, oh where, to
find the time...
> And once the template HTML files are parsed, I can just leave the $root_node
> in memory for future requests, right?
Yes, I suppose so... I never thought hard about doing that.
You'd have to be careful, though: the Perl/mod_perl API
*intends* for you to manipulate/change the structure/content,
so what you're left with after a page generation is not the
same HTML tree you started with.
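One way around that would be to keep the parsed tree untouched and give each request its own copy. Here is a rough sketch of the idea, in Python for brevity. Node, PRISTINE and render are made-up names for illustration and are not HTML Tree's API. Whether copying is actually cheaper than re-parsing is something you'd have to measure.

```python
import copy

class Node:
    """Hypothetical stand-in for a parsed HTML element (not HTML Tree's API)."""
    def __init__(self, name, text="", children=None):
        self.name = name
        self.text = text
        self.children = children or []

# Parsed once at startup and never modified afterwards.
PRISTINE = Node("html", children=[Node("title", "TEMPLATE")])

def render(title):
    # Work on a private deep copy so the cached tree stays intact.
    tree = copy.deepcopy(PRISTINE)
    tree.children[0].text = title
    return tree
```

Each call to render() mutates only its own copy, so the cached template is the same for the next request.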
The HTML DOM parsing is really fast since that part of the
software is written natively in C++ with memory-mapped I/O.
(It's the same parsing engine used in SWISH++, FWIW.)
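For anyone unfamiliar with memory-mapped I/O, here is a small illustration of the general technique, in Python for brevity. It is not the actual C++ parser code, and count_tag_opens is a made-up name. The file is mapped read-only and scanned in place rather than being read into a separate buffer first.

```python
import mmap

def count_tag_opens(path):
    """Count '<' bytes in a file by scanning a read-only memory map."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; note that an empty file cannot be mapped.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as buf:
            count, pos = 0, buf.find(b"<")
            while pos != -1:
                count += 1
                pos = buf.find(b"<", pos + 1)
            return count
```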
HTML Tree does, however, cache the associated Perl (.pm) code
for a page just like Apache::Registry does (based on timestamp).
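The timestamp check amounts to something like the following sketch (Python for brevity; load_cached and loader are hypothetical names, and this is not the actual HTML Tree or Apache::Registry code). A file is reloaded only when its modification time differs from the one recorded at the last load.

```python
import os

_cache = {}  # path -> (mtime in ns, loaded object)

def load_cached(path, loader):
    """Return loader(path), reloading only if the file's mtime has changed."""
    mtime = os.stat(path).st_mtime_ns
    hit = _cache.get(path)
    if hit is not None and hit[0] == mtime:
        return hit[1]          # unchanged on disk: reuse the cached object
    value = loader(path)       # new or modified: load again
    _cache[path] = (mtime, value)
    return value
```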
> And since the parser isn't validating, I can extend HTML with tags like
> <NODE>, <BANNER>, and <WHATEVER>, right?
Not really, actually. The reason is that the HTML parser
doesn't know anything about non-standard HTML elements so it
can't know if an element (not "tag") you make up has a required
or optional end tag to get the parsing ("balancing") right. If
this were XML, it would be a different story.
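To make the balancing problem concrete, here is a toy sketch (Python for brevity; it is not the real parser, and the element tables are a tiny hand-picked subset). The parser needs a per-element table to know that <BR> never has an end tag and that a new <LI> implicitly closes an open one. For a name that isn't in the table there is nothing to look up, so this sketch just skips it.

```python
import re

VOID = {"br", "hr", "img"}            # never have an end tag
OPTIONAL_END = {"p", "li"}            # end tag may be omitted
KNOWN = VOID | OPTIONAL_END | {"html", "body", "div", "ul", "b"}

TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^>]*>")

def outline(html):
    """Return (depth, element name) pairs for a toy subset of HTML."""
    stack, out = [], []
    for m in TAG.finditer(html):
        closing, name = m.group(1) == "/", m.group(2).lower()
        if name not in KNOWN:
            continue  # made-up element: skipped, no effect on balancing
        if closing:
            if name in stack:
                # Pop back to (and including) the matching open element.
                while stack.pop() != name:
                    pass
            continue
        # A new <p> or <li> implicitly closes an open one of the same name.
        if name in OPTIONAL_END and stack and stack[-1] == name:
            stack.pop()
        out.append((len(stack), name))
        if name not in VOID:
            stack.append(name)
    return out
```

With this, outline("<ul><li>a<li>b</ul>") puts both <LI> elements at the same depth even though neither has an end tag, which is only possible because "li" is in the table.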
Currently, "made up" elements are ignored entirely, i.e., they
are *not* added to the resultant HTML DOM tree and do not
affect "balancing" during parsing.
- Paul