Grzegorz Kossakowski wrote:
Hello,

I would like to discuss making ServletConnection cache-aware. Daniel suggested earlier that we use standard HTTP protocol concepts. I totally agree with him and would like to propose a solution, but first let's discuss the requirements.

Requirements
============

The requirements that ServletConnection must meet are really simple:
1. ServletConnection should provide data that can be used for constructing a small validation object.
2. ServletConnection should expose functionality for checking whether a previous response is still valid, taking only the validation object as input.
3. We would like ServletConnection to make as few round trips as possible in every situation it encounters.

To satisfy these requirements I propose to use the concept of HTTP conditional GETs [1], more precisely the If-Modified-Since request-header field [2]. This way we have the following cases:
* ServletConnection does not have the information needed to create an If-Modified-Since header, but the response includes a Last-Modified header and the full content. A validity object can be created.
* ServletConnection does not have the information needed to create an If-Modified-Since header, and the response does not include a Last-Modified header but includes the full content. A validity object cannot be created.
* ServletConnection does have the information needed to create an If-Modified-Since header. The resource has not been modified, so a 304 (Not Modified) status code is returned and the response does not include the full content. Thus ServletConnection can just tell that the content is still valid and can be fetched from the cache.
* ServletConnection does have the information needed to create an If-Modified-Since header. The resource has been modified, so a 200 status code is returned and the response includes the full content. ServletConnection tells that the cached content is invalid and returns the fresh content.

Requirements are satisfied:
1. The Last-Modified header can be used to construct the validation object.
2. Taking the date from the validation object enables ServletConnection to formulate a conditional GET, and the HTTP status code of the response then settles whether the resource is still valid.
3. In every case we have only one round trip.
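
The four cases above collapse into a small decision on the status code and the Last-Modified value. The following is only a sketch of that decision, not code from the servlet-service framework: the class name `ConditionalGetCases` and the `Outcome` enum are hypothetical, and the `lastModified` argument is assumed to follow the `java.net.URLConnection.getLastModified()` convention, where 0 means the header was absent.

```java
import java.net.HttpURLConnection;

// Hypothetical helper, for illustration only.
final class ConditionalGetCases {

    enum Outcome { USE_CACHED, FRESH_WITH_VALIDITY, FRESH_WITHOUT_VALIDITY }

    // statusCode: HTTP status of the response to the (possibly conditional) GET.
    // lastModified: Last-Modified in epoch milliseconds, 0 if the header is absent.
    static Outcome classify(int statusCode, long lastModified) {
        if (statusCode == HttpURLConnection.HTTP_NOT_MODIFIED) {
            // 304: no body was sent, the cached content is still valid.
            return Outcome.USE_CACHED;
        }
        // Full content was sent; a validity object is only possible
        // when the response carried a Last-Modified header.
        return lastModified > 0
                ? Outcome.FRESH_WITH_VALIDITY
                : Outcome.FRESH_WITHOUT_VALIDITY;
    }
}
```

In each branch only one request/response pair is needed, which matches point 3.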

I agree with everything so far; it would also be nice to add ETag handling to it. The idea is that the servlet-service-fw should work with all kinds of servlets. The Last-Modified and ETag headers are the two main ways to handle caching for HTTP, so by supporting both we make caching work for a larger share of the servlets. But the first priority is of course to make it work with the SitemapServlet.

Using ETags together with If-None-Match is analogous to using Last-Modified together with If-Modified-Since as you described above. Some extra care is needed if the servlet called from the ServletConnection returns both an ETag and a Last-Modified header.
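
One possible way to handle the "both validators" case on the client side is to send every validator that was stored, since the HTTP/1.1 specification recommends that clients which received both an entity tag and a Last-Modified value use both in conditional requests. The sketch below only assumes that the validation object holds an optional ETag string and a Last-Modified timestamp; the class and method names are hypothetical.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class ValidatorHeaders {

    // HTTP-date format, e.g. "Thu, 01 Jan 1970 00:00:01 GMT".
    private static final DateTimeFormatter HTTP_DATE =
            DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
                    .withZone(ZoneOffset.UTC);

    // etag may be null; lastModifiedMillis <= 0 means "unknown".
    static Map<String, String> conditionalHeaders(String etag, long lastModifiedMillis) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (etag != null && !etag.isEmpty()) {
            headers.put("If-None-Match", etag);
        }
        if (lastModifiedMillis > 0) {
            headers.put("If-Modified-Since",
                    HTTP_DATE.format(Instant.ofEpochMilli(lastModifiedMillis)));
        }
        return headers;
    }
}
```

With no stored validator the map is empty and the request is an ordinary, unconditional GET.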

Implementation proposal
=======================

We should start by making pipelines more HTTP-compliant. This requires taking If-Modified-Since headers into account and returning the appropriate status code when a caching pipeline is processed. The behavior of non-caching pipelines should not change.
Agree. There is some getLastModified info on the cachedResponse object in the AbstractCachingProcessingPipeline. It doesn't seem like it is used for setting the Last-Modified header or used together with the If-Modified-Since header however.

Also it might be that one could use the SourceValidity object (or maybe a hash key based on it) as an ETag.
Then we should implement setIfModifiedSince and getIfModifiedSince from java.net.URLConnection and construct requests according to the value of that property. The getResponseCode method should also be implemented.
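
To make the intended use of these methods concrete, here is a rough sketch with an in-memory stand-in for ServletConnection. Everything in it apart from the `java.net` API is an assumption: `InMemoryServletConnection` is a hypothetical class that never touches the network and answers 304 or 200 by comparing the `ifModifiedSince` property with a fixed resource timestamp.

```java
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical stand-in for ServletConnection, for illustration only.
final class InMemoryServletConnection extends HttpURLConnection {

    private final long resourceLastModified; // epoch milliseconds

    InMemoryServletConnection(URL url, long resourceLastModified) {
        super(url);
        this.resourceLastModified = resourceLastModified;
    }

    @Override public void connect() { connected = true; }
    @Override public void disconnect() { connected = false; }
    @Override public boolean usingProxy() { return false; }

    // 304 if a date was supplied and the resource is not newer than it, else 200.
    @Override
    public int getResponseCode() {
        long since = getIfModifiedSince(); // 0 means "not set"
        boolean unchanged = since > 0 && resourceLastModified <= since;
        return unchanged ? HTTP_NOT_MODIFIED : HTTP_OK;
    }

    // How a source validity check might use the connection.
    static boolean stillValid(long resourceLastModified, long cachedLastModified) {
        try {
            URL url = URI.create("http://localhost/").toURL();
            InMemoryServletConnection c =
                    new InMemoryServletConnection(url, resourceLastModified);
            c.setIfModifiedSince(cachedLastModified);
            return c.getResponseCode() == HTTP_NOT_MODIFIED;
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The validity check needs nothing but the stored date, which is the property asked for in requirement 2.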

All changes proposed above will enable us to implement source validation of ServletSource very easily.

Comments? Thoughts?
Seems like the right direction to me.
I can start implementing this as soon as we come to an agreement on it. However, I would like to point out that I'll need some support to make changes in the pipeline stuff. I've taken a look at the code and not everything seems to be clear. Any volunteer on the board? ;-)
I can't say that the pipeline code is entirely clear to me either. Maybe some of the original authors are still around?

Last remark. I know that my English is quite poor and it could be that I do not express my thoughts clearly enough. I'm really working on it and you should not hesitate to ask when something is hard to understand.
Don't worry about that. I don't have any problem to understand what you write. As soon as I had learned a little bit more about the HTTP protocol it was perfectly clear ;)

/Daniel
