While we're on the topic of DOM-based templating... FormEncode has a module htmlfill (http://formencode.org/docs/htmlfill.html), which is basically like DOM-based templating that just knows about HTML forms. But currently it doesn't use a DOM, it uses an HTMLParser subclass. This makes it much more complex than it would otherwise be, and misses out on some potential performance gains -- many times the input to htmlfill will be output from a template or HTML generator, and so often the DOM from the template is serialized to text, then parsed again.
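To make the parser-based approach concrete, here is a minimal sketch of form filling built on the standard library's `HTMLParser`. It is not htmlfill's actual code: `FormFiller` and `fill` are hypothetical names, and it only handles `<input>` tags. Untouched start tags are echoed verbatim, and `getpos()` records where each filled field was defined.

```python
from html import escape
from html.parser import HTMLParser


class FormFiller(HTMLParser):
    """Hypothetical sketch: fill <input> values, echo everything else."""

    def __init__(self, defaults):
        # convert_charrefs=False routes entity and character references
        # to their own handlers so they can be written back unchanged.
        super().__init__(convert_charrefs=False)
        self.defaults = defaults
        self.out = []
        self.positions = {}  # field name -> (line, column) of its tag

    def _render(self, tag, attrs, close):
        fields = dict(attrs)
        name = fields.get("name")
        if tag != "input" or name not in self.defaults:
            # No transformation here: reuse the original tag text as-is.
            return self.get_starttag_text()
        self.positions[name] = self.getpos()
        fields["value"] = self.defaults[name]
        rendered = " ".join(
            k if v is None else '%s="%s"' % (k, escape(v, quote=True))
            for k, v in fields.items()
        )
        return "<input %s%s" % (rendered, close)

    def handle_starttag(self, tag, attrs):
        self.out.append(self._render(tag, attrs, ">"))

    def handle_startendtag(self, tag, attrs):
        self.out.append(self._render(tag, attrs, " />"))

    def handle_endtag(self, tag):
        # The parser only reports the lowercased tag name, so the exact
        # original spelling of an end tag is not recoverable here.
        self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append("&%s;" % name)

    def handle_charref(self, name):
        self.out.append("&#%s;" % name)

    def handle_comment(self, data):
        self.out.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.out.append("<!%s>" % decl)

    def handle_pi(self, data):
        self.out.append("<?%s>" % data)


def fill(html_text, defaults):
    """Return (filled_html, positions) for the given default values."""
    parser = FormFiller(defaults)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out), parser.positions
```

Calling `fill(page, {"user": "ian"})` returns the rewritten markup plus a mapping from field name to the (line, column) of its tag. Even this small version needs a handler for every kind of markup just to pass it through, which hints at where the complexity comes from.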
I had thought about moving this to a DOM or DOM-ish thing of some sort, but I don't know which one. Unfortunately many of the options are not very humane -- that is, they are "correct", but not user-friendly. (I won't claim HTMLParser is that humane either; but I'm looking to improve this.) Here's what I'd like, and maybe someone can suggest something:

* Can parse HTML, not just XHTML. Not the crazy HTML browsers parse, but unambiguous well-formed HTML. I don't like the idea of putting the HTML through tidy; that's fine for a screen-scraper, but is way too defensive for this kind of thing.

* Can generate HTML. This is probably easy to tack onto most systems, even if it isn't present now -- it's just a couple rules about how to serialize tags.

* Doesn't modify the output at all for areas where no transformations occurred. It doesn't wipe out whitespace. It *definitely* doesn't lose comments. It keeps attribute order. When nodes are modified it's sometimes ambiguous how that affects the output, so if attribute order is lost there it's not that big a deal.

* Can output nicely-formatted code. Probably easy to add, but nice if it's already there. This is, of course, entirely contrary to the previous item ;) When generating nodes *purely* from Python, systems tend to produce HTML/XML that has no extra whitespace at all and is completely unreadable.

* Keeps around enough information to produce good error messages. It needs to be possible to figure out the line and maybe column where a node was originally defined. If we're supporting multiple transformations by multiple systems, then this information needs to persist through the transformations. I think this is a really important and undervalued feature; anyone can write a templating system with crappy error messages (and lots and lots of people do). Good error messages set a templating system apart.

* Reasonably fast.

I've played around just a bit with ElementTree, but I only felt so-so about it.
I felt like it was pretty correct, but not very humane -- maybe that'd be good enough if I was processing big XML documents, but it doesn't work for HTML templating.

-- Ian Bicking / [EMAIL PROTECTED] / http://blog.ianbicking.org