Carsten Ziegeler wrote:

Sylvain Wallez wrote:


Carsten Ziegeler wrote:



While debugging/profiling a very big application for our customer
I found out that the current implementation of the
TraxTransformer is slowing down caching!


Why? Well, the caching algorithm asks every sitemap component if
the cached content is still valid. The TraxTransformer answers
this question by checking whether the stylesheet has changed since
the last use (time stamp comparison).


So far so good, but you can have imports/includes in your xslt,
so the TraxTransformer checks them as well - and this is done by
"parsing" the xslt and looking at all includes/imports. This
parsing is done even when the content is fetched out of the cache.


Due to this mechanism, each stylesheet is parsed on every
request (whether cached content is used or not), which is in most
cases unnecessary. As we didn't use the "use-store" parameter of the
xslt transformer this is a real performance problem!
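
To illustrate the check described above, here is a minimal Java sketch of a time stamp based validity covering a main stylesheet plus its imports/includes. `TimestampValidity` and its methods are hypothetical names chosen for illustration, not Cocoon's actual classes, and the caller is assumed to supply the current time stamps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch, not Cocoon's real validity implementation. */
class TimestampValidity {
    // URI -> last-modified time recorded when the cached content was produced
    private final Map<String, Long> recorded = new LinkedHashMap<>();

    /** Remember the time stamp of one stylesheet (main, imported or included). */
    void add(String uri, long lastModified) {
        recorded.put(uri, lastModified);
    }

    /**
     * The cached content is still valid only if every recorded source
     * is present in 'current' with an unchanged time stamp.
     */
    boolean isValid(Map<String, Long> current) {
        for (Map.Entry<String, Long> e : recorded.entrySet()) {
            Long now = current.get(e.getKey());
            if (now == null || now.longValue() != e.getValue().longValue()) {
                return false;
            }
        }
        return true;
    }
}
```

The comparison itself is cheap; the expensive part in the situation described here is discovering the list of imports/includes, which requires parsing the stylesheet.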




Is the reparsing always occurring, even when "use-store" is "true"? I
guess not.

This was discussed a while ago, and we have here the combination of two
bugs/deficiencies:

use-store
---------
Why in hell is use-store set to false??? IIRC, it was first set to true
because the transient store was actually not transient and tried to
serialize compiled XSLTs in the persistent store, which failed because
these objects are not serializable.

Let's switch it to true and ensure the transient store is really transient.

XSLTProcessor
-------------
This component's design is intrinsically bad from a cache perspective:
the only way to access validity is through
getTransformerHandlerAndValidity which always creates the
TransformerHandler even if we don't use it. Combine this with
use-store=false, and we end up reparsing the XSL at each call.



As the parsing is very time consuming, delivering cached
content is still "slow". We had figures where cached content
took 1.5 sec (and producing it from scratch took 1.8 sec).


With the recent changes we are down below 100ms for delivering
the cached content! I added a "check-includes" configuration to
the TraxTransformer. If you set it to "false", imported stylesheets
are not checked for changes for the caching, but you really feel
the performance difference.


So, you lose a little bit of comfort but gain a lot of
performance. And if you use it only for production, it shouldn't
be a problem anyway. (The default is "as-is")
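
The trade-off of such a flag can be sketched in Java as follows. This is only an illustration under assumptions: `IncludeCheckPolicy` and `sourcesToCheck` are hypothetical names, and the real TraxTransformer code may be organised quite differently.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of what a "check-includes" style flag decides. */
class IncludeCheckPolicy {
    /**
     * Returns the stylesheets whose time stamps are consulted when
     * deciding whether cached content is still valid.
     */
    static List<String> sourcesToCheck(String mainStylesheet,
                                       List<String> includes,
                                       boolean checkIncludes) {
        List<String> result = new ArrayList<>();
        result.add(mainStylesheet);      // the main stylesheet is always checked
        if (checkIncludes) {
            result.addAll(includes);     // imports/includes only when requested
        }
        return result;
    }
}
```

With the flag off, an edit to an imported stylesheet goes unnoticed until the main stylesheet changes or the cache is cleared.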




This way of solving the problem is hacky as it forces people to choose
between speed and auto-reload and will often lead them to either not
understand why their changes are not taken into account or to
choose the "secure" way by setting auto-reload to false.

We must refactor the XSLTProcessor so that:
- it returns a MultiSourceValidity if needed (see in o.a.c.c.source.impl
in scratchpad).
- getting the validity in the transient store is clearly separated from
getting the TransformerHandler.
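
A rough Java sketch of the separation proposed in the second point, using only standard JAXP calls. The class `SplitXsltProcessor`, its method names and the use of a plain time stamp as "validity" are assumptions made for illustration; this is not the actual XSLTProcessor interface and it does not model MultiSourceValidity.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical sketch; not thread-safe, for illustration only. */
class SplitXsltProcessor {
    private static final class Entry {
        final Templates templates;
        final long lastModified;
        Entry(Templates templates, long lastModified) {
            this.templates = templates;
            this.lastModified = lastModified;
        }
    }

    private final SAXTransformerFactory factory =
            (SAXTransformerFactory) TransformerFactory.newInstance();
    // In-memory ("transient") store of compiled stylesheets, keyed by URI
    private final Map<String, Entry> transientStore = new HashMap<>();
    private int compileCount = 0;

    /** Cheap lookup: never parses XSLT and never creates a handler. */
    Long getValidity(String uri) {
        Entry e = transientStore.get(uri);
        return e == null ? null : Long.valueOf(e.lastModified);
    }

    /** Compiles only when the stylesheet is unknown or its time stamp changed. */
    TransformerHandler getTransformerHandler(String uri, String xslt, long lastModified) {
        try {
            Entry e = transientStore.get(uri);
            if (e == null || e.lastModified != lastModified) {
                Templates t = factory.newTemplates(new StreamSource(new StringReader(xslt)));
                compileCount++;
                e = new Entry(t, lastModified);
                transientStore.put(uri, e);
            }
            return factory.newTransformerHandler(e.templates);
        } catch (TransformerConfigurationException ex) {
            throw new IllegalStateException("Cannot compile " + uri, ex);
        }
    }

    int getCompileCount() {
        return compileCount;
    }
}
```

With this split, the caching layer can call `getValidity` on every request and ask for a handler only when the pipeline really has to run.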



PS: The new feature will be released with 2.1.3 in approx. two weeks.




-1.
If you need the optimisation quickly for your customer, please make a
different class or keep it private until we do the clean refactoring.



Which will never happen due to time constraints ...(just a joke).



You know what? I started this refactoring on my HD at the time this problem was raised, but never had the time to finish it...


Anyway, I agree that refactoring the XSLTProcessor is a way to go and that with useStore the problem is not that important.

BUT, then even if the content is fetched from the cache, the XSLT Processor is "activated" for the stylesheet, which is imho a total overkill for finding out that the main stylesheet has changed; so I still think this option is very very useful and doesn't do any harm.



Sorry, I don't understand "activated". Do you mean that a TransformerHandler is created? That's exactly the design flaw I pointed out.


Just make some speed comparisons with and without the flag and see if it helps you as well.



I'm more than sure that it helps! But I wouldn't like a temporary quick'n'dirty workaround to go into a release...


Sylvain

--
Sylvain Wallez                                  Anyware Technologies
http://www.apache.org/~sylvain           http://www.anyware-tech.com
{ XML, Java, Cocoon, OpenSource }*{ Training, Consulting, Projects }
Orixo, the opensource XML business alliance  -  http://www.orixo.com



