Hi,

Wow!

I didn't expect this discussion to go in this direction, but excellent!

To illustrate what I originally had in mind, I have committed my
prototype in [1].

Please note that this is *only* about setting the Last-Modified and
Cache-Control headers.
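
As a rough sketch of the header handling meant here (this is not the code
in [1]; the class and method names are made up for illustration, and plain
Java is used instead of the servlet API), the decisions could look like this:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoUnit;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Sling: models the filter's decisions only.
final class CacheControlSketch {

    // HTTP date format with a two-digit day, always rendered in GMT.
    private static final DateTimeFormatter HTTP_DATE = DateTimeFormatter
            .ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
            .withZone(ZoneOffset.UTC);

    /** Only GET requests without request parameters get cache headers. */
    static boolean isCacheable(String method, String queryString) {
        return "GET".equals(method)
                && (queryString == null || queryString.isEmpty());
    }

    /** True if a 304 can be sent for the given If-Modified-Since value. */
    static boolean notModified(String ifModifiedSince, Instant lastModified) {
        if (ifModifiedSince == null || lastModified == null) {
            return false; // nothing to compare: render normally
        }
        try {
            Instant since = ZonedDateTime
                    .parse(ifModifiedSince, DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
            // HTTP dates have second precision, so truncate before comparing.
            return !lastModified.truncatedTo(ChronoUnit.SECONDS).isAfter(since);
        } catch (DateTimeParseException e) {
            return false; // unparsable header: render normally
        }
    }

    /** Headers to preset; cacheControl is assumed to come from configuration. */
    static Map<String, String> presetHeaders(String cacheControl, Instant lastModified) {
        Map<String, String> headers = new LinkedHashMap<>();
        if (cacheControl != null) {
            headers.put("Cache-Control", cacheControl);
        }
        if (lastModified != null) {
            headers.put("Last-Modified", HTTP_DATE.format(lastModified));
        }
        return headers;
    }
}
```

A real filter would read the modification time from the jcr:lastModified
property of the request's resource and write the headers to the response.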

Now, taking this a step further: do we really want to build a cache into
Sling? Shouldn't we rather rely on an existing caching proxy for
this, like Squid or mod_cache/mod_proxy?

As for what to cache (if we cache): I think we should not cache
requests with query strings; such requests are by definition not cacheable.
A multi-dimensional cache that takes the requesting user into account
is also an interesting idea.
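
To make the two points above a bit more concrete, here is a minimal sketch
of a cache key with a user dimension. All names are hypothetical; following
Eric's suggestion quoted below, it keys on group membership rather than on
the individual user ID, and it refuses to build a key for requests with a
query string:

```java
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Sling.
final class CacheKeySketch {

    /**
     * Returns a cache key, or an empty Optional if the request should not
     * be cached at all (non-GET, or a query string is present).
     */
    static Optional<String> cacheKey(String method, String path, String queryString,
                                     boolean loggedIn, Set<String> groups) {
        boolean hasQuery = queryString != null && !queryString.isEmpty();
        if (!"GET".equals(method) || hasQuery) {
            return Optional.empty();
        }
        // Sort the group names so that the same membership always yields
        // the same key, regardless of iteration order.
        String userDimension = loggedIn
                ? "groups:" + String.join(",", new TreeSet<>(groups))
                : "anonymous";
        return Optional.of(path + "|" + userDimension);
    }
}
```

Two users in the same groups share an entry here, which is what keeps the
number of cache dimensions (and the bookkeeping cost) down.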

My fear is that we run into a performance drain just to manage the cache.

Regards
Felix

[1] http://svn.apache.org/repos/asf/sling/whiteboard/fmeschbe/cachecontrol


On Wed, Apr 28, 2010 at 4:02 PM, Eric Norman <[email protected]> wrote:
> Hi all,
>
> In general, I like the idea of a server side cache.  However, I agree with
> Vidar that a cache without resource tracking has limited usefulness in a
> real system.
>
> In the past I had implemented something similar.
>
> The key parts I remember were:
>
>   - I used a (slightly) modified version of the OSCache library for
>   managing the cache: http://www.opensymphony.com/oscache/
>   - Cache only for GET requests
>   - The cacheKey had to contain (at a minimum) the following information:
>      1. Is the current user logged in? (anonymous vs. real user)
>      2. What groups is the current user a member of (in case ACLs affect
>      what is rendered).  Also, the ACEs for all the resources used to
>      render the response would need to use group principals instead of
>      individual userids to make the cache value reusable by more users.
>      3. The current theme, language, or other options from the user
>      preferences that may affect how the page is rendered.
>      4. A version of the requested query string that has been sorted (in
>      case the params come in a different order).
>      5. Filter out "jsessionid" if it is present on the url
>   - When rendering the page keep track of all the resources used to render
>   the page.  Using the OSCache APIs, the resources were tracked by adding the
>   resource path as a 'group' on the cache entry.
>   - Special handling is needed for cache invalidation during ACL changes in
>   case changing the ACL causes the content of the page to change.
>   - Sometimes tracking resources used is not sufficient as you may have a
>   page that is listing the children of a container.  Adding a new child to the
>   container would also need to invalidate the cache entry.  To handle this,
>   pages that do such things would need to add a container 'group' to the cache
>   entry (cacheEntry.addGroup(container:[resourcePath])).
>   - Use a (Synchronous) JCR Observer to listen for changes to resources.
>   If a change is detected, invalidate any cache entries that reference the
>   changed resource (or entries that track the parent container). In OSCache
>   this is done by flushing the group (the resource path) to invalidate any
>   entries that reference the group path.
>   - During the rendering of the page there should be some way for the
>   script to indicate that it should not be cached.
>   - Sometimes caching the whole page is not possible if the page contains
>   user specific text (for example, username in the page header) but it may be
>   possible to cache fragments of the page instead.
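
The group-based invalidation described in the list above can be sketched
with plain maps. This does not use the OSCache API; the class, the method
names and the "container:" prefix convention are assumptions made for
illustration, and the sketch is not thread-safe:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a cache with OSCache-style groups.
final class GroupedCacheSketch {

    private final Map<String, String> entries = new HashMap<>();          // key -> rendered output
    private final Map<String, Set<String>> keysByGroup = new HashMap<>(); // group -> keys

    /** Stores an entry and records every group (resource path) it depends on. */
    void put(String key, String rendered, Set<String> groups) {
        entries.put(key, rendered);
        for (String group : groups) {
            keysByGroup.computeIfAbsent(group, g -> new HashSet<>()).add(key);
        }
    }

    String get(String key) {
        return entries.get(key);
    }

    /** Drops every entry that was registered under the given group. */
    void flushGroup(String group) {
        Set<String> keys = keysByGroup.remove(group);
        if (keys != null) {
            entries.keySet().removeAll(keys);
        }
    }

    /**
     * What an observation listener would call for a changed path: flush
     * entries using the resource itself, plus entries that list the
     * children of its parent container.
     */
    void resourceChanged(String path) {
        flushGroup(path);
        int slash = path.lastIndexOf('/');
        if (slash > 0) {
            flushGroup("container:" + path.substring(0, slash));
        }
    }
}
```

A page listing the children of /content/menu would register the group
"container:/content/menu", so adding /content/menu/item3 flushes it even
though the new item was never used to render the cached page.
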
>
>
> Anyways, that's my 2 cents.
>
> Regards,
> Eric
>
> On Wed, Apr 28, 2010 at 4:35 AM, Vidar Ramdal <[email protected]> wrote:
>
>> On Wed, Apr 28, 2010 at 1:13 PM, Felix Meschberger
>> <[email protected]> wrote:
>> > Hi all,
>> >
>> > I have been reasoning with a colleague about a request-level Filter
>> > for Sling to support caching.
>> >
>> > The idea (partly implemented in a prototype) is to have the
>> > request filter set up default caching behaviour for the response (if
>> > the response is cacheable at all, that is, the request method must be
>> > GET and there are no request parameters):
>> >
>> > * The Cache-Control header is preset with values from configuration
>> > matching the request URI (or resource path)
>> > * The Last-Modified header is preset with the jcr:lastModified
>> > property of the request's resource
>> > * Eagerly respond with 304/NOT MODIFIED if the If-Modified-Since
>> > header is set, a last modification time of the resource can be
>> > resolved, and the resource has not been modified since.
>>
>> The question is how useful such a filter would be if only the
>> last-modified date of the requested resource is used.
>>
>> In our application at least, there is a large number of resources
>> involved when serving a request. Most CMSs list out menus, for
>> example, where the menu items are other resources. If one of those
>> resources has changed, or if a new menu item has been created,
>> it means the menu will be out of date even if the requested resource
>> itself is unmodified.
>>
>> To solve this, we could introduce a resource tracker, which tracks
>> which resources are being invoked on a request. The latest
>> last-modified date of these resources will then be matched with the
>> request's If-Modified-Since header.
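
A minimal sketch of such a resource tracker, assuming each resource used
while rendering reports its path and last-modified time (the class and
method names are hypothetical, not an existing Sling API):

```java
import java.time.Instant;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical per-request tracker, not part of Sling.
final class ResourceTrackerSketch {

    private final Map<String, Instant> used = new LinkedHashMap<>();
    private boolean unknownDate = false;

    /** Records a resource used for rendering; null means "no date available". */
    void track(String path, Instant lastModified) {
        if (lastModified == null) {
            unknownDate = true; // cannot prove freshness any more
        } else {
            used.merge(path, lastModified, (a, b) -> a.isAfter(b) ? a : b);
        }
    }

    /** The newest modification time among all tracked resources. */
    Optional<Instant> latestModification() {
        return used.values().stream().max(Comparator.naturalOrder());
    }

    /** True if nothing tracked has changed after the client's copy was made. */
    boolean isFresh(Instant ifModifiedSince) {
        if (unknownDate) {
            return false;
        }
        return latestModification()
                .map(latest -> !latest.isAfter(ifModifiedSince))
                .orElse(false);
    }
}
```

Since the set of resources is only known once a page has been rendered, it
would have to be remembered from an earlier rendering to answer a
conditional request. On its own, the sketch also does not notice newly
created menu items.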
>>
>> --
>> Vidar S. Ramdal <[email protected]> - http://www.idium.no
>> Sommerrogata 13-15, N-0255 Oslo, Norway
>> + 47 22 00 84 00 / +47 21 531941, ext 2070
>>
>
