Hi,

Wow!

I didn't expect this discussion to go in this direction, but excellent!

To illustrate what I originally had in mind, I have committed my prototype
in [1]. Please note that this is *only* about setting the Last-Modified
and Cache-Control headers.

Now, taking this a step further: do we really want to build a cache into
Sling? Shouldn't we rather rely on an existing caching proxy for this,
like Squid or mod_cache/mod_proxy?

As for what to cache (if we cache): I think we should not cache requests
with queries; such requests are by definition not cacheable.

A multi-dimensional cache that takes the requesting user into account is
also an interesting idea. My fear is that we run into a performance drain
just to manage the cache ...

Regards
Felix

[1] http://svn.apache.org/repos/asf/sling/whiteboard/fmeschbe/cachecontrol

On Wed, Apr 28, 2010 at 4:02 PM, Eric Norman <[email protected]> wrote:
> Hi all,
>
> In general, I like the idea of a server side cache. However, I agree with
> Vidar that a cache without resource tracking has limited usefulness in a
> real system.
>
> In the past I had implemented something similar.
>
> The key parts I remember were:
>
>   - I used a (slightly) modified version of the OSCache library for
>   managing the cache: http://www.opensymphony.com/oscache/
>   - Cache only for GET requests
>   - The cacheKey had to contain (at a minimum) the following information:
>      1. Is the current user logged in? (anonymous vs. real user)
>      2. What groups is the current user a member of (in case ACLs affect
>      what is rendered). Also, the ACEs for all the resources used to
>      render the response would need to use group principals instead of
>      individual userids to make the cache value reusable by more users.
>      3. The current theme, language, or other options from the user
>      preferences that may affect how the page is rendered.
>      4. A version of the requested query string that has been sorted (in
>      case the params come in a different order).
>      5. Filter out "jsessionid" if it is present on the url
>   - When rendering the page keep track of all the resources used to render
>   the page. Using the OSCache APIs, the resources were tracked by adding the
>   resource path as a 'group' on the cache entry.
>   - Special handling is needed for cache invalidation during ACL changes in
>   case changing the ACL causes the content of the page to change.
>   - Sometimes tracking resources used is not sufficient as you may have a
>   page that is listing the children of a container. Adding a new child to the
>   container would also need to invalidate the cache entry. To handle this,
>   pages that do such things would need to add a container 'group' to the cache
>   entry (cacheEntry.addGroup(container:[resourcePath])).
>   - Use a (Synchronous) JCR Observer to listen for changes to resources.
>   If a change is detected, invalidate any cache entries that reference the
>   changed resource (or entries that track the parent container). In OSCache
>   this is done by flushing the group (the resource path) to invalidate any
>   entries that reference the group path
>   - During the rendering of the page there should be some way for the
>   script to indicate that it should not be cached.
>   - Sometimes caching the whole page is not possible if the page contains
>   user specific text (for example, username in the page header) but it may be
>   possible to cache fragments of the page instead.
>
>
> Anyways, that's my 2 cents.
>
> Regards,
> Eric
>
> On Wed, Apr 28, 2010 at 4:35 AM, Vidar Ramdal <[email protected]> wrote:
>
>> On Wed, Apr 28, 2010 at 1:13 PM, Felix Meschberger
>> <[email protected]> wrote:
>> > Hi all,
>> >
>> > I have been discussing with a colleague a request level Filter
>> > for Sling to support caching.
>> >
>> > The idea (partly implemented by a prototype) is to have the
>> > request filter set up default caching behaviour of the response (if the
>> > response is cacheable at all, that is, the request method must be GET and
>> > there are no request parameters):
>> >
>> >  * The Cache-Control header is preset with values from configuration
>> > matching the request URI (or resource path)
>> >  * The Last-Modified header is preset with the jcr:lastModified
>> > property of the request's resource
>> >  * Eager responding with 304/NOT MODIFIED if the If-Modified-Since
>> > header is set and a last modification time of the resource can be
>> > resolved.
>>
>> The question is how useful such a filter would be if only the
>> last-modified date of the requested resource is used.
>>
>> In our application at least, there is a large number of resources
>> involved when serving a request. Most CMSs list out menus, for
>> example, where the menu items are other resources. If one of those
>> resources has changed, or if a new menu item has been created,
>> the menu will be out of date even if the requested resource itself
>> is unmodified.
>>
>> To solve this, we could introduce a resource tracker, which tracks
>> which resources are being invoked on a request. The latest
>> last-modified date of these resources will then be matched with the
>> request's If-Modified-Since header.
>>
>> --
>> Vidar S. Ramdal <[email protected]> - http://www.idium.no
>> Sommerrogata 13-15, N-0255 Oslo, Norway
>> + 47 22 00 84 00 / +47 21 531941, ext 2070
>>
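
For illustration, the cache key ingredients Eric lists (logged-in state, group membership, user preferences, a sorted query string, and no "jsessionid") could be combined roughly as follows. This is only a minimal Python sketch to show the normalization, not OSCache or Sling code; build_cache_key, its parameters, and the "|" separated key layout are hypothetical names and choices made for this example.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit

# Matches a ";jsessionid=..." path parameter as added by URL rewriting.
_JSESSIONID = re.compile(r";jsessionid=[^;/?#]*", re.IGNORECASE)


def build_cache_key(url, is_logged_in, groups, preferences):
    """Hypothetical helper: build a cache key from the listed ingredients.

    url          -- request path, optionally with a query string
    is_logged_in -- False for the anonymous user
    groups       -- names of the groups the current user is a member of
    preferences  -- dict of options that affect rendering (theme, language)
    """
    parts = urlsplit(_JSESSIONID.sub("", url))
    # Sort the query parameters so that their order does not matter, and
    # drop a jsessionid that was passed as a regular parameter.
    params = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() != "jsessionid"
    )
    return "|".join([
        parts.path,
        urlencode(params),
        "user" if is_logged_in else "anonymous",
        # Groups instead of the user id, so entries are shared between users.
        ",".join(sorted(set(groups))),
        urlencode(sorted(preferences.items())),
    ])
```

With this, two requests that differ only in parameter order, group order, or a session id in the URL map to the same key, which is what makes a cached entry reusable by more than one user.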

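
To make the combination of the two proposals in this thread concrete (the eager 304 response from the filter idea, and Vidar's resource tracker supplying the latest last-modified date), here is a small Python sketch of the decision logic. It is not the prototype in [1]; ResourceTracker and conditional_get are hypothetical names, and the sketch assumes all modification times are timezone-aware UTC datetimes.

```python
from email.utils import format_datetime, parsedate_to_datetime


class ResourceTracker:
    """Hypothetical per-request record of the resources used for rendering."""

    def __init__(self):
        self._last_modified = {}

    def track(self, path, last_modified):
        # last_modified is assumed to be a timezone-aware UTC datetime.
        self._last_modified[path] = last_modified

    def latest(self):
        # None when nothing was tracked, i.e. no modification time is known.
        return max(self._last_modified.values(), default=None)


def conditional_get(method, query_string, if_modified_since, tracker):
    """Return (status, headers) for a request.

    304 is returned only for a GET without request parameters whose tracked
    resources have all been unchanged since the client's copy.
    """
    latest = tracker.latest()
    if method != "GET" or query_string or latest is None:
        return 200, {}
    # HTTP dates have one-second resolution, so drop the microseconds.
    latest = latest.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(latest, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparsable header: fall through to a full response
        if since is not None and since.tzinfo is not None and latest <= since:
            return 304, headers
    return 200, headers
```

The point of taking the maximum over all tracked resources is the menu case described above: a change to any menu item moves the effective Last-Modified forward even when the requested resource itself is unmodified.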