Let's face it: we are slow. Not painfully slow, not even very slow: actually we are pretty fast for being a server-side Java application and we are such great programmers that we are as fast as the environment on which we are hosted can be fast :-)
Yet I'm afraid we are still slammed in the face by other technologies: static pages, Apache modules, PHP and so on. We have to change this somehow, and I think that there is a solution at least for what matters most to users: perceived performance.

I had great results in the past using servlets with reverse proxies in front of them. This is useful since it at least optimizes bandwidth and network communication, which results in a performance boost. But in order to improve the result I had to manually tweak the response headers. Cocoon has a limited capability of doing so: the only place where I could find something was the HttpHeaderAction. From there I can do (almost) whatever I want, but this would also mean that every pipeline entry would have to start with an action: difficult to maintain, to say the least.

What I'm thinking about is a sort of clone of the mod_expires functionality: mod_expires (http://httpd.apache.org/docs/mod/mod_expires.html) has a simple syntax that makes it possible to implement flexible handling of caching headers. Basically, what it can do is set the Expires header based on modifiers acting on the access time (when the resource was requested, on a one-to-one basis) or on the modification time (a one-to-many approach). This is an example taken from the docs:

    ExpiresByType text/html "access plus 1 month 15 days 2 hours"
    ExpiresByType image/gif "modification plus 5 hours 3 minutes"

This is incredibly powerful when applied to real-life scenarios, and I'd really love to see this feature in Cocoon. But what are the best semantics for it? The first thing that comes to my mind is that the directive should be an attribute, not an element.
If we also consider that it might be hard to get and calculate a modification time for the resources that make up a pipeline, I wouldn't bother with it and would base the whole system on the access time or on an absolute time, using a plain syntax like "2h5m" or "25feb2002" to express the expiry, together with some keywords like "never" or "always".

Now let's think about the placement of such a directive: I'll start by stating that such directives can be thought of as being at the same level as "handle-errors", so they might appear in the pipeline declaration. Let's assume that we have four kinds of resources:

1. static files that basically don't change:

    <match pattern="static/**">
      <read src="static-files/{1}"/>
    </match>

2. dynamically built resources that last for a while and change at a fixed date:

    <match pattern="catalog/**/*.html">
      <generate src="catalog/{1}/{2}.xsp"/>
      <transform src="stylesheets/catalog.xsl" />
      <serialize/>
    </match>

3. dynamic resources that change often:

    <match pattern="press-releases/**/*.html">
      <generate src="xmldb:xindice///PR/{1}/{2}.xml"/>
      <transform src="stylesheets/articles.xsl" />
      <serialize/>
    </match>

4. dynamic resources that change every time:

    <match pattern="myportal/**.html">
      <aggregate> ... </aggregate>
      <transform src="stylesheets/articles.xsl" />
      <serialize/>
    </match>

Now let's try to assemble them with two possible syntaxes:

1. different pipelines:

    <!-- This one expires one year from now -->
    <pipeline expires="1y">
      <match pattern="static/**">
        <read src="static-files/{1}"/>
      </match>
    </pipeline>

    <!-- This one expires at the end of the month. -->
    <!-- Will need to be changed afterwards -->
    <pipeline expires="31jan2002">
      <match pattern="catalog/**/*.html">
        <generate src="catalog/{1}/{2}.xsp"/>
        <transform src="stylesheets/catalog.xsl" />
        <serialize/>
      </match>
    </pipeline>

    <!-- This one expires 6hr after the first access -->
    <pipeline expires="6h">
      <match pattern="press-releases/**/*.html">
        <generate src="xmldb:xindice///PR/{1}/{2}.xml"/>
        <transform src="stylesheets/articles.xsl" />
        <serialize/>
      </match>
    </pipeline>

    <!-- Finally, this one always expires -->
    <pipeline expires="always">
      <match pattern="myportal/**.html">
        <aggregate> ... </aggregate>
        <transform src="stylesheets/articles.xsl" />
        <serialize/>
      </match>
    </pipeline>

2. more granular: defined at the "pipeline" level but overridable:

    <pipeline expires="6h">
      <match pattern="static/**" expires="1y">
        <read src="static-files/{1}"/>
      </match>
      <match pattern="catalog/**/*.html" expires="31jan2002">
        <generate src="catalog/{1}/{2}.xsp"/>
        <transform src="stylesheets/catalog.xsl" />
        <serialize/>
      </match>
      <match pattern="press-releases/**/*.html">
        <generate src="xmldb:xindice///PR/{1}/{2}.xml"/>
        <transform src="stylesheets/articles.xsl" />
        <serialize/>
      </match>
      <match pattern="myportal/**.html" expires="always">
        <aggregate> ... </aggregate>
        <transform src="stylesheets/articles.xsl" />
        <serialize/>
      </match>
    </pipeline>

I would say that syntax #1 is more consistent with the current setup, but feedback is really appreciated.

Implementation should be pretty trivial: it would be just a matter of understanding the configuration and setting a couple of headers. Yet this would give us a tremendous performance boost, especially for self-contained webapps where we might put our resources and read them without worrying about performance issues: a reverse proxy will do all the dirty work for us.

I eagerly wait for your feedback.
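To make the "understanding the configuration and setting a couple of headers" part a bit more concrete, here is a minimal Java sketch of how an expires value such as "2h5m", "25feb2002", "never" or "always" might be turned into an Expires header value. Nothing here is existing Cocoon code: the class and method names (ExpiresSketch, expiresAt, headerValue) are hypothetical, the "d" unit is an addition of mine, a year is assumed to be 365 days, an absolute date is assumed to expire at midnight UTC, and "never" is mapped to one year ahead because HTTP/1.1 (RFC 2616) advises against Expires dates further in the future than that.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: maps an expires="..." value to an Expires header. */
public class ExpiresSketch {

    // Relative form: one or more <number><unit> pairs, e.g. "2h5m" or "1y".
    // Units (assumed): y = 365 days, d = days, h = hours, m = minutes.
    private static final Pattern RELATIVE = Pattern.compile("(?:\\d+[ydhm])+");
    private static final Pattern PART = Pattern.compile("(\\d+)([ydhm])");

    // Absolute form: day, three-letter English month, year, e.g. "25feb2002".
    private static final Pattern ABSOLUTE =
            Pattern.compile("(\\d{1,2})([a-z]{3})(\\d{4})");
    private static final List<String> MONTHS = Arrays.asList(
            "jan", "feb", "mar", "apr", "may", "jun",
            "jul", "aug", "sep", "oct", "nov", "dec");

    // HTTP-date layout (RFC 1123 style), always expressed in GMT.
    private static final DateTimeFormatter HTTP_DATE = DateTimeFormatter
            .ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.ENGLISH)
            .withZone(ZoneOffset.UTC);

    /** Computes the expiry instant for a spec, relative to the access time. */
    public static Instant expiresAt(String spec, Instant accessTime) {
        String s = spec.trim().toLowerCase(Locale.ENGLISH);
        if (s.equals("always")) {
            return accessTime; // already stale: caches must revalidate
        }
        if (s.equals("never")) {
            return accessTime.plus(Duration.ofDays(365)); // one-year cap
        }
        if (RELATIVE.matcher(s).matches()) {
            Duration total = Duration.ZERO;
            Matcher part = PART.matcher(s);
            while (part.find()) {
                long n = Long.parseLong(part.group(1));
                switch (part.group(2)) {
                    case "y": total = total.plusDays(365 * n); break;
                    case "d": total = total.plusDays(n); break;
                    case "h": total = total.plusHours(n); break;
                    default:  total = total.plusMinutes(n); break; // "m"
                }
            }
            return accessTime.plus(total);
        }
        Matcher abs = ABSOLUTE.matcher(s);
        if (abs.matches()) {
            int month = MONTHS.indexOf(abs.group(2)) + 1;
            if (month > 0) {
                // LocalDate.of rejects impossible dates such as 31feb2002.
                LocalDate date = LocalDate.of(Integer.parseInt(abs.group(3)),
                        month, Integer.parseInt(abs.group(1)));
                return date.atStartOfDay(ZoneOffset.UTC).toInstant();
            }
        }
        throw new IllegalArgumentException("Unrecognized expires value: " + spec);
    }

    /** Formats an instant as the value of an HTTP Expires header. */
    public static String headerValue(Instant expires) {
        return HTTP_DATE.format(expires);
    }
}
```

With the access time taken as the moment the request is served, the result of headerValue(expiresAt("6h", Instant.now())) could be passed to something like HttpServletResponse.setHeader("Expires", ...). A Cache-Control max-age header derived from the same duration would be a natural second header, but that is left out of the sketch.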
Ciao,

-- 
Gianugo