Cool, that's nice to know :)

Of course not every XML parser behaves the same, but libxml2 (which is what
PHP uses for its DOM parsing) isn't the least used parser out there either,
so it's likely to perform decently.

I've noticed that libxml2 honors the doctype perfectly, and if none is set
it automagically adds a loose.dtd doctype, which I think is probably the
right thing to do anyhow.
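
To make the doctype behaviour above concrete, here is a minimal sketch in
Python. It is not libxml2 or the PHP code itself; it only mimics the
"keep the doctype if present, otherwise add a loose.dtd one" rule. The
function name ensure_doctype and the exact doctype string are assumptions
for illustration.

```python
from html.parser import HTMLParser

# Assumed for illustration: the HTML 4.0 Transitional doctype that points
# at loose.dtd, as typically seen in libxml2-based HTML output.
LOOSE_DOCTYPE = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" '
    '"http://www.w3.org/TR/REC-html40/loose.dtd">'
)


class DoctypeFinder(HTMLParser):
    """Records the first doctype declaration found in the markup."""

    def __init__(self):
        super().__init__()
        self.doctype = None

    def handle_decl(self, decl):
        # decl arrives without the surrounding "<!" and ">".
        if self.doctype is None and decl.lower().startswith("doctype"):
            self.doctype = decl


def ensure_doctype(markup):
    """Return markup unchanged if it declares a doctype, else prepend one."""
    finder = DoctypeFinder()
    finder.feed(markup)
    finder.close()
    if finder.doctype is not None:
        return markup  # honor the gadget's own doctype
    return LOOSE_DOCTYPE + "\n" + markup
```

A gadget that ships its own doctype passes through untouched, while a bare
fragment such as "<p>hi</p>" comes back with the loose doctype in front.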

Thanks for the info!

   -- Chris

On Sat, Apr 11, 2009 at 6:57 PM, Kevin Brown <e...@google.com> wrote:

> On Sat, Apr 11, 2009 at 4:03 AM, Chris Chabot <chab...@google.com> wrote:
>
> > Hey All,
> >
> > I've committed my big change to the gadget rendering, which switches it
> > from plain old 'output text strings' to a completely DOMDocument-based
> > model (i.e. the gadget's document is parsed into a DOM document, and all
> > the js and css are injected through DOM functions (appendChild, etc.)).
> >
> > The upside of this change is that the same logic is reusable for the
> > proxied content parsing, where the same js and css need to be injected
> > (whether or not the remote content has a proper <html><head><body>
> > structure, valid existing js/css, etc). By using the DOM functions,
> > libxml sorts all this out for us and always makes a proper html document
> > out of it. It also significantly reduced duplicate code paths, and the
> > end result is a much cleaner implementation.
> >
> > The downside is that the output is now slightly different than before.
> > Most noticeably, 'broken' html will come out slightly different than
> > before (because libxml 'fixes' it), which has the potential of breaking
> > gadgets.
> >
> > I've tested my standard set of gadgets (Buddy Poke, iLike, OSDA,
> > Compliance Test, etc) and they all work fine through the new parser, but
> > if you have any gadgets to test with, I would love to hear how they work
> > on the new renderer. http://www.partuza.nl/ now runs this new
> > implementation, so that's probably the easiest place to test right now.
>
>
> The Java code was already doing this, and has been for ~6 months. So far no
> complaints. One thing to watch for is preservation of the doctype, as the
> spec requires it.
>
> We're doing something similar with CSS as well (to facilitate rewriting),
> and that did cause some issues, mostly because we were using a fairly
> strict CSS parser.
>
> >
> >
> > Thanks!
> >
> >   -- Chris
> >
>