Re: [Web-SIG] DOM-based templating

Donovan Preston Fri, 03 Jun 2005 00:45:25 -0700

On Jun 2, 2005, at 11:18 PM, Ian Bicking wrote:

While we're on the topic of DOM-based templating...

FormEncode has a module htmlfill
(http://formencode.org/docs/htmlfill.html), which is basically like
DOM-based templating that just knows about HTML forms. But currently it
doesn't use a DOM, it uses an HTMLParser subclass. This makes it much
more complex than it would otherwise be, and misses out on some
potential performance gains -- many times the input to htmlfill will be
output from a template or HTML generator, and so often the DOM from the
template is serialized to text, then parsed again.

I had thought about moving this to a DOM or DOM-ish thing of some sort,
but I don't know which one. Unfortunately many of the options are not
very humane -- that is, they are "correct", but not user-friendly.
Here's what I'd like, and maybe someone can suggest something (I won't
claim HTMLParser is that humane either; but I'm looking to improve
this). Here's what I'd like:

We've talked about this slightly before, but I think now more than ever stan can be that DOM. I don't think it would be too much work; it would mostly require removing assumptions that other nevow modules are available. I think stan could be broken out of nevow and into a standalone thing by pulling these modules:

nevow.stan

nevow.tags

nevow.loaders

nevow.context

And the package:

nevow.flat

I'm willing to do the work, and I'm willing to remove assumptions it makes and refactor things until they are clean. The module which would require the most work is nevow.context -- an internal rendering implementation detail that Nevow makes explicit but I would want to hide from non-Nevow users of stan. nevow.context was the first module of nevow to be written, and has a bunch of crufty bad decisions that haven't yet been refactored out of existence. But I'd like to do it, and this would give me an excuse to.

* Can parse HTML, not just XHTML. Not the crazy HTML browsers parse,
but unambiguous well-formed HTML. I don't like the idea of putting the
HTML through tidy; that's fine for a screen-scraper, but is way too
defensive for this kind of thing.

nevow.loaders.htmlfile does a good job of parsing normal html. nevow.loaders.xmlfile parses strict XHTML and allows more tag tricks, but I think casual users won't notice the difference, especially for the purpose you desire.

* Can generate HTML. This is probably easy to tack onto most systems,
even if it isn't present now -- it's just a couple rules about how to
serialize tags.

HTML rather than XHTML? I'm curious what the motivation for this is, and if you know what the couple of rules would be. I think it wouldn't be too hard.
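
For illustration, here is a minimal sketch of what "a couple rules" could look like. It is not stan's or htmlfill's serializer: the (tag, attrs, children) tuple shape and the serialize() name are hypothetical, and VOID_ELEMENTS is only a representative subset of the elements HTML defines as having no end tag. The two rules shown are that void elements are written without a closing tag (and without the XHTML "/>"), and that boolean attributes are written as a bare name.

```python
from html import escape

# Representative subset of HTML void elements (elements with no end tag).
VOID_ELEMENTS = {
    "area", "base", "br", "col", "hr", "img", "input", "link", "meta",
}


def serialize(node):
    """Serialize a text string or a hypothetical (tag, attrs, children) tuple.

    attrs is a list of (name, value) pairs so that attribute order is kept;
    a value of True marks a boolean attribute.
    """
    if isinstance(node, str):
        return escape(node, quote=False)
    tag, attrs, children = node
    parts = ["<", tag]
    for name, value in attrs:
        if value is True:
            # Rule: boolean attributes are minimized (checked, not checked="checked").
            parts.append(" " + name)
        else:
            parts.append(' %s="%s"' % (name, escape(str(value), quote=True)))
    parts.append(">")
    if tag in VOID_ELEMENTS:
        # Rule: void elements get no end tag and no self-closing slash.
        return "".join(parts)
    parts.extend(serialize(child) for child in children)
    parts.append("</%s>" % tag)
    return "".join(parts)
```

Note that a non-void element with no children still gets an explicit end tag (<textarea></textarea>), since HTML parsers do not treat <textarea/> as closed.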

Hmm, I guess the motivation for the previous point is the next point?

* Doesn't modify the output at all for areas where no transformations
occurred. It doesn't wipe out whitespace. It *definitely* doesn't lose
comments. It keeps attribute order. When nodes are modified it's
sometimes ambiguous how that affects the output, so if attribute order
is lost there it's not that big a deal.

stan is whitespace in, whitespace out. It keeps comments. It uses a dict for attributes, but this could be changed easily. nevow.url uses a list of tuples, because order is actually important there. This means it needs to have a different API; it has .add() as well as .replace(). Add appends a new key/value pair, even if the key is already present; replace finds any existing keys and puts a new value in their place, preserving the original order.
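
A rough sketch of the add/replace semantics described above, applied to tag attributes. This is not nevow.url's code: the OrderedAttrs class is hypothetical, and the handling of duplicate keys in replace() (keep the first position, drop later duplicates, append if the key is absent) is an assumption the description does not settle.

```python
class OrderedAttrs:
    """Hypothetical order-preserving attribute store backed by a list of pairs."""

    def __init__(self, pairs=()):
        self._pairs = list(pairs)

    def add(self, name, value):
        # Always appends, even if the name is already present.
        self._pairs.append((name, value))

    def replace(self, name, value):
        # Put the new value where the first existing entry was, so the
        # original order is kept. Assumption: later duplicates are dropped,
        # and a missing name is appended at the end.
        result = []
        placed = False
        for key, old in self._pairs:
            if key != name:
                result.append((key, old))
            elif not placed:
                result.append((name, value))
                placed = True
        if not placed:
            result.append((name, value))
        self._pairs = result

    def items(self):
        return list(self._pairs)
```

With a store like this, a transformation that only rewrites one attribute leaves every other attribute exactly where the template author put it.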

* Can output nicely-formatted code. Probably easy to add, but nice if
it's already there. This is, of course, entirely contrary to the
previous item ;) When generating nodes *purely* from Python, systems
tend to produce HTML/XML with no extra whitespace at all, and completely
unreadable.

This is really, really, really a bad idea. While browsers claim to be whitespace agnostic, they make a huge rendering distinction between "no whitespace present" and "any whitespace present". Nevow preserves any whitespace that was originally in your template, but when generating tags from Python it can't, so it doesn't.

That said, it is something I have considered writing before. Woven had it. I found it to be more trouble than it is worth. I think it should be added, but you should have to go out of your way to turn it on, and it should be off by default.

* Keeps around enough information to produce good error messages. It
needs to be possible to figure out the line and maybe column where a
node was originally defined. If we're supporting multiple
transformations by multiple systems, then this information needs to
persist through the transformations. I think this is a really important
and undervalued feature; anyone can write a templating system with
crappy error messages (and lots and lots of people do). Good error
messages set a templating system apart.

A great idea. It would be trivial to add file/line/column information and populate it differently in each of the loaders. I love it, I'm going to go do it right now.
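
As a sketch of how a loader might populate that information, the standard library's HTMLParser exposes getpos(), which returns the current (line, offset) pair. Everything else here is hypothetical: the PositionRecordingParser class, the load() helper, and the dict-shaped node records are illustrative only and are not how nevow.loaders works.

```python
from html.parser import HTMLParser


class PositionRecordingParser(HTMLParser):
    """Hypothetical loader that remembers where each start tag was defined."""

    def __init__(self, filename):
        super().__init__()
        self.filename = filename
        self.nodes = []

    def handle_starttag(self, tag, attrs):
        # getpos() gives a 1-based line number and a 0-based column offset.
        line, column = self.getpos()
        self.nodes.append(
            {"tag": tag, "file": self.filename, "line": line, "column": column}
        )


def load(source, filename="<string>"):
    """Parse source and return the recorded start-tag positions."""
    parser = PositionRecordingParser(filename)
    parser.feed(source)
    parser.close()
    return parser.nodes
```

For the information to survive several transformations, each transformation would have to copy the file/line/column fields onto whatever nodes it produces, so an error raised late in the pipeline can still point at the original template.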

* Reasonably fast.

Nevow was designed as an optimization of woven, and as a result is pretty fast. It has a two-pass system where one pass is taken when the template is initially loaded (once per template per process, assuming the template doesn't change on disk) and non-important nodes are optimized out of what actually happens at render time.
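
The general shape of such a two-pass scheme can be sketched as follows. This is a simplified illustration, not Nevow's implementation: the precompile() and render() names are hypothetical, and the template is assumed to be a flat list in which plain strings are static output and callables are dynamic slots.

```python
def precompile(nodes):
    """Load-time pass: merge runs of static strings, keep callables as slots."""
    compiled = []
    buffer = []
    for node in nodes:
        if callable(node):
            if buffer:
                compiled.append("".join(buffer))
                buffer = []
            compiled.append(node)
        else:
            buffer.append(node)
    if buffer:
        compiled.append("".join(buffer))
    return compiled


def render(compiled, data):
    """Render-time pass: only the dynamic slots do any work."""
    return "".join(part(data) if callable(part) else part for part in compiled)


# Usage: precompile once per template, render once per request.
template = ["<html>", "<body>", "<h1>", lambda d: d["title"], "</h1>", "</body>", "</html>"]
compiled = precompile(template)
page = render(compiled, {"title": "Hi"})
```

Here seven template nodes collapse to three compiled parts, so the render pass touches one callable and two prebuilt strings instead of walking every node.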

There's also a bunch of low-hanging optimization work in nevow.context. When I originally wrote it, I was worried about people mutating things so I made lots of copies. In the meantime, it turns out that the "correct" style of using it is to not mutate things but be somewhat functional and side-effect free. Since mutating is still nice for some things, the objects which get mutated get copied before you get called to mutate them. But far more copying currently happens than is necessary.

Yet another thing I have been meaning to do but haven't gotten around to, that this might encourage me to do.

I've played around just a bit with ElementTree, but I only felt so-so
about it. I felt like it was pretty correct, but not very humane --
maybe that'd be good enough if I was processing big XML documents, but
it doesn't work for HTML templating.

Agreed.

