Sylvain Wallez wrote:
Carsten Ziegeler wrote: Which will never happen due to time constraints ... (just a joke).
While debugging/profiling a very big application for our customer I found out that the current implementation of the
TraxTransformer is slowing down caching!
Why? Well, the caching algorithm asks every sitemap component if the cached content is still valid. The TraxTransformer answers
this question by looking if the stylesheet has changed since the
last use (time stamp comparison).
So far so good, but you can have imports/includes in your xslt, so the TraxTransformer checks them as well - and this is done by
"parsing" the xslt and looking at all includes/imports. This
parsing is done, even when the content is fetched out of the cache.
Due to this mechanism, each stylesheet is parsed on every request (whether cached content is used or not) which is in most cases
unnecessary. As we didn't use the "use-store" parameter of the
xslt transformer this is a real performance problem!
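To make the check described above concrete, here is a minimal Java sketch of a timestamp-based validity that covers the main stylesheet and its imports/includes. The class name `AggregatedTimestampValidity` and its methods are hypothetical and are not Cocoon's API. The sketch assumes the list of included URIs is recorded once, when the cached content is produced, so that a later validity check only compares timestamps and does not need to parse the stylesheet again.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToLongFunction;

/** Hypothetical illustration of a timestamp-based validity; not Cocoon's API. */
final class AggregatedTimestampValidity {
    // URI -> last-modified timestamp seen when the cached content was produced
    private final Map<String, Long> recorded = new LinkedHashMap<>();

    /** Remember the timestamp of the main stylesheet or of one import/include. */
    void record(String uri, long lastModified) {
        recorded.put(uri, lastModified);
    }

    /**
     * The cached content is still valid only if every recorded source
     * reports the same timestamp as before. No stylesheet is parsed here.
     */
    boolean isValid(ToLongFunction<String> currentLastModified) {
        for (Map.Entry<String, Long> e : recorded.entrySet()) {
            if (currentLastModified.applyAsLong(e.getKey()) != e.getValue()) {
                return false;
            }
        }
        return true;
    }
}
```

The caller supplies the current timestamps (for example from the file system), so the check costs one lookup per recorded source.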
Is the reparsing always occurring, even when "use-store" is "true"? I guess not.
This was discussed a while ago, and we have here the combination of two bugs/deficiencies:
use-store
---------
Why in hell is use-store set to false??? IIRC, it was first set to true because the transient store was actually not transient and tried to serialize compiled XSLTs in the persistent store, which failed because these objects are not serializable.
Let's switch it to true and ensure the transient store is really transient.
XSLTProcessor
-------------
This component's design is intrinsically bad from a cache perspective: the only way to access validity is through getTransformerHandlerAndValidity, which always creates the TransformerHandler even if we don't use it. Combine this with use-store=false, and we end up reparsing the XSL at each call.
As the parsing is very time consuming, delivering a cached content is still "slow". We had figures, where a cached content
took 1.5 sec (and producing it from scratch took 1.8 sec).
With the recent changes we are down below 100ms for delivering the cached content! I added a "check-includes" configuration to
the TraxTransformer. If you set it to "false", imported stylesheets
are not checked for changes for the caching, but you really feel
the performance difference.
So, you lose a little bit of comfort but gain a lot of performance improvements. And if you use it only for production, it shouldn't
be a problem anyway. (The default is "as-is")
This way of solving the problem is hacky as it forces users to choose between speed and auto-reload and will often lead people to either not understand why their changes are not taken into account or lead them to choose the "secure" way by setting auto-reload to false.
We must refactor the XSLTProcessor so that:
- it returns a MultiSourceValidity if needed (see in o.a.c.c.source.impl in scratchpad).
- getting the validity in the transient store is clearly separated from getting the TransformerHandler.
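A rough Java sketch of the second point, separating the validity lookup from the creation of the compiled stylesheet, could look like the following. All names here (`SplitXsltProcessor`, `isStillValid`, `getTemplates`, `compileCount`) are hypothetical and do not reflect the actual XSLTProcessor interface. A plain in-memory map stands in for the transient store, and only standard JAXP calls are used for compiling. For brevity it tracks a single timestamp per stylesheet rather than a multi-source validity.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical sketch; class and method names are illustrative, not Cocoon's. */
final class SplitXsltProcessor {
    /** What the in-memory (transient) store keeps per stylesheet. */
    private static final class Entry {
        final long lastModified;
        final Templates templates;

        Entry(long lastModified, Templates templates) {
            this.lastModified = lastModified;
            this.templates = templates;
        }
    }

    // Stand-in for the transient store: never serialized, lives in memory only.
    private final Map<String, Entry> transientStore = new ConcurrentHashMap<>();
    private final TransformerFactory factory = TransformerFactory.newInstance();

    /** Number of stylesheet compilations, exposed only to make the effect visible. */
    int compileCount = 0;

    /** Cheap path: compares timestamps only, never parses or compiles. */
    boolean isStillValid(String uri, long currentLastModified) {
        Entry e = transientStore.get(uri);
        return e != null && e.lastModified == currentLastModified;
    }

    /** Expensive path: compiles only when the store has no up-to-date entry. */
    Templates getTemplates(String uri, String xslText, long currentLastModified) {
        Entry e = transientStore.get(uri);
        if (e == null || e.lastModified != currentLastModified) {
            try {
                Templates t = factory.newTemplates(
                        new StreamSource(new StringReader(xslText)));
                compileCount++;
                e = new Entry(currentLastModified, t);
                transientStore.put(uri, e);
            } catch (TransformerConfigurationException ex) {
                throw new IllegalStateException("Cannot compile " + uri, ex);
            }
        }
        return e.templates;
    }
}
```

With this split, the cache can call `isStillValid` on every request and only touch `getTemplates` when the pipeline actually has to run, so serving cached content does not compile anything.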
PS: The new feature will be released with 2.1.3 in approx. two weeks.
-1. If you need the optimisation quickly for your customer, please make a different class or keep it private until we do the clean refactoring.
You know what? I started this refactoring on my HD at the time this problem was raised, but never had the time to finish it...
Anyway, I agree that refactoring the XSLTProcessor is the way to go and that with useStore the problem is not that important.
BUT, then even if the content is fetched from the cache, the XSLT Processor is "activated" for the stylesheet, which is imho a total overkill for finding out that the main stylesheet has changed; so I still think this option is very very useful and doesn't do any harm.
Sorry, I don't understand "activated". Do you mean that a TransformerHandler is created? That's exactly the design flaw I pointed out.
Just make some speed comparisons with and without the flag and see if it helps you as well.
I'm more than sure that it helps! But I wouldn't like a temporary quick'n'dirty workaround to go into a release...
Sylvain
--
Sylvain Wallez                                  Anyware Technologies
http://www.apache.org/~sylvain           http://www.anyware-tech.com
{ XML, Java, Cocoon, OpenSource }*{ Training, Consulting, Projects }
Orixo, the opensource XML business alliance  -  http://www.orixo.com