Re: The Guardian website moves to Velocity

Jonathan Revusky Fri, 11 May 2007 14:25:22 -0700

Robert Koberg wrote:

On Fri, 2007-05-11 at 21:06 +0200, Jonathan Revusky wrote:
Robert Koberg wrote:
http://freemarker.sourceforge.net/docs/xgui_declarative_basics.html
First, you use a W3 DOM which is very expensive/inefficient. XSL
processors create a processor optimized DOM. For example saxon creates
something like a List of SAX events.
Back in ancient times (early 2003) we used to use Saxon to generate the FreeMarker docs. That was using the XSLT docbook project stylesheets from Norman Walsh to generate the cooked HTML. (The FM docs are maintained in a canonical docbook XML format.)
When FM's XML processing became mature enough, I decided to rewrite our docgen stuff using FreeMarker itself. The results were quite amazing, and I reported them here.
http://www.theserverside.com/news/thread.tss?thread_id=20015
The FreeMarker docs generation task was about 15x faster than the one using Saxon/XSLT. I honestly don't know what the bottleneck in the XSLT stuff was, but these were XSLT stylesheets written as part of the docbook project, by a guy who, in principle, really knows this stuff. I don't recall what memory usage was. I think I looked, but didn't care so much about it, as long as I had enough memory. But IIRC, memory usage was also much higher using the XSLT stuff.
Oooh I just hate XSL bashing :) Apologies to people annoyed by this
thread, but... The Docbook XSL library is *huge* (trying to figure it
out is how I learned XSL in the late 90s). Just creating the
transformer, before you even get to the transformation, is what is most
likely taking the time. If, however, you had a cached Templates object
and derived your Transformer from that, then I bet XSL would blow your
transformation away, even with the huge docbook XSL library.
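
The caching Robert describes here can be sketched with the standard JAXP API (javax.xml.transform): compile the stylesheet once into a Templates object, then derive a cheap Transformer from it for each document. This is only an illustration under stated assumptions. The class name and the tiny inline stylesheet are hypothetical stand-ins, not the DocBook XSL and not anything either poster actually ran.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class CachedTemplatesSketch {
    // Hypothetical stand-in stylesheet; a real run would load the DocBook XSL instead.
    static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/doc'><xsl:value-of select='title'/></xsl:template>"
      + "</xsl:stylesheet>";

    // The expensive step: parse and compile the stylesheet. Do this once and keep the result.
    static Templates compile(String xsl) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        return factory.newTemplates(new StreamSource(new StringReader(xsl)));
    }

    // The cheap step: derive a fresh Transformer from the cached Templates for each document.
    static String transform(Templates cached, String xml) throws TransformerException {
        Transformer transformer = cached.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        Templates cached = compile(XSL); // compiled once, reused for every page below
        for (String title : new String[] {"one", "two"}) {
            System.out.println(transform(cached, "<doc><title>" + title + "</title></doc>"));
        }
    }
}
```

A fair timing comparison would measure compile() separately from the per-document transform() calls, since the disagreement in this thread is about which of the two dominates.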

Why do you think this? You have, by your own admission, never even used FreeMarker. Why are you so sure XSL is actually faster? (I can see where you doubt the 15x result. I was very surprised by it myself, but why do you think XSL is faster?)

I think other people's experience -- doing completely different things -- has been that FM is faster. Quite a lot faster. I remember seeing a thread about this a while back. But I can't find it now, googling, I have to confess.

So, if you
were not aware of this type of thing and for each page transform you
were always reloading the entire docbook library, no wonder it was so
much slower. I also bet you did not implement the entirety of the
Docbook XSL in your code. Even with that, I bet a compiled and cached
Templates derived Transformer would beat FM. Anyway, it is very rare
that someone needs or can understand 'full-on' docbook...

That's only one case. We haven't done extensive benchmarks or anything, but I remember seeing other people commenting that they were finding FreeMarker *much* faster.
bah. Let's see, comparing something you wrote (the transformation
logic) with your specific needs to something written with a huge generic
scope, hmmm....

The FTL code that handles the transformation does not handle all the elements in the docbook vocabulary, because we don't use all the elements in the docbook vocabulary. However, if it did, it would not be appreciably slower, AFAICS.

Second, XSL v1 is pretty mature and
the processors are very good. There are processors for Java, .NET, C,
Eiffel, and probably others that can use the exact same XSL.
Okay, that's true enough.
But the idea that the XSLT processors are faster and more efficient is not likely to be true, I don't think. The limited data I have (and I have no reason to BS you on this) suggest that the truth is 180º away from this. FM's XML processing is much faster.
Well, I would say it shows more that you know how to run FM faster than
you know how to run XSL fast.

No particular attempt was made to optimize the FM code. OTOH, you're right, I suppose, that I don't know how to run XSL fast. Apparently most other people don't because just googling around, you have a lot of comments about XML transformations being rather dog-slow.

But basically, no particular attempt was made to optimize either the XSL or the FM. And that's the result. The FM transformation was 15x faster. I don't know exactly why. If you actually got in there and figured out why, then you could tell me, and I'd be interested.

Or maybe the XSLT implementations improved a lot in the last 4 years... I dunno... surely not *that* much...
Of course they have, both v1 and v2.


What? Things like Saxon are 15x faster than they were 4 years ago? Really?

My guess is that our project's results on docs generation, with FM running 15x faster, that's so overwhelming that none of the various XSLT implementations in *any* language are likely to be as fast as FM on Java. Even something directly coded in C is IMO, not likely to be more than... I dunno... 3x faster, which would still leave FM 5x faster. (For that specific benchmark, I grant. Maybe it's not typical.)
this is just a silly comparison. I would have thought someone as pure
and honest as you would not make such ridiculous claims.

Pardon my language, but I'm really getting fucking tired of this kind of bullshit. I said totally clearly with all the appropriate qualifications that this was just our experience of it and the benchmark might not be applicable generally. I never claimed that FM was 15x faster in general. I simply said this was our experience.

Bobby boy, I am going to put my money where my mouth is. You can check out our docgen module and run it. If you can rewrite that XML transformation in XSLT and get it to run faster than the FM one we have, using any Java XSLT implementation, I will wire you $500. Or I'll donate the $500 to the charity of your choice, like maybe the W3C fan club.


And better yet, I will eat my words in public.

I do not believe you can do that. I think you're full of shit and I'm now officially getting tired of this conversation.

The FTL that transforms the docs is, IIRC, about 400 lines. Rewrite it in XSLT and get it to run faster. An easy 500 bucks, go for it, dude.


Jonathan Revusky
--
lead developer, FreeMarker project, http://freemarker.org/

My same XSL
that usually ran on the server can now effectively run on the client
(for a 'preview' type of thing) for reduced server load. Third, I have
come to rely on a great deal of the W3 XSL spec, I would hate to switch
to something that does not implement it all.
Well, yeah, but, of course, FM doesn't implement the W3 XSL spec. But isn't that like talking about whether a Pascal compiler implements the ANSI C standard? It's a different language...
yes -FM needs a specific environment - java (and I doubt it will ever
run in the browser)
Fourth, I have not gone up to
XSL v2 yet (mainly because of browser rendering), but it would be
relatively painless. Whereas switching to FreeMarker would require *A
LOT* of work and most likely a lot of feature requests... anyhoo, don't
want to get off-topic :)
Likely, you could replace your:

XML->XSLT->VTL->final output

to just:

XML->FTL->final output.
And that truly would be simpler, I'd say. Of course, I told you this before, and you weren't interested for whatever reason.
And I told you before, we pregenerate the pages. Think of it like a cache, 
kinda.
Well, the scheme you describe may have some advantages, probably does. But simplicity is not, AFAICS, one of them. (We were talking about the advantages of simplicity, weren't we?) I mean, somebody new coming in, trying to get their arms around your system, how it all works, for them to understand how the XML gets turned into the HTML that somebody sees, they have to understand a two-step thing involving two completely different tools, right?
Well, for the developer who creates the XSL (who is usually an XSL
expert) it is probably only a bit difficult to create output that will
be dynamic at runtime (that is why a simple templating language is a
good thing here).
But, the real end user of our CMS just has to create some simple XML in
a WYSIWYG editor. For example, for an online poll they need to provide
the question and a list of answers. The XSL creates the runtime code
that, depending on the req, either shows them the quiz and allows
submission or shows them the current results, whatever is needed. So the
author can fiddle away, wordsmith the quiz, add/remove answers, etc and
just press generate when they need to qa it and push a button to promote
through different staging environs and then onto production.

-Rob


