Re: mod_cache: Don't update when req max-age=0?
On Tue, 22 May 2007, Henrik Nordstrom wrote:

> On Tue, 2007-05-22 at 11:40 +0200, Niklas Edmundsson wrote:
>
>> Does anybody see a problem with changing mod_cache to not update the stored headers when the request has max-age=0, the body turns out not to be stale and the on-disk header hasn't expired?
>
> My understanding: From an RFC point of view, it's fine for the cache to completely ignore a 304 and not update the stored entity at all. But the response to this request should be the merge of the two responses, assuming the conditional was added by the cache.

This is in line with my understanding, and since the response merging is already being done today, the only change would be to skip storing the header to disk.

I think it would be wise to skip the storing only for the max-age=0 case, though.

Should I try to whip up a patch for it then?

/Nikke
--
Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se
Re: mod_cache: Don't update when req max-age=0?
On 5/24/07, Niklas Edmundsson wrote:

> On Tue, 22 May 2007, Henrik Nordstrom wrote:
>
>> On Tue, 2007-05-22 at 11:40 +0200, Niklas Edmundsson wrote:
>>
>>> Does anybody see a problem with changing mod_cache to not update the stored headers when the request has max-age=0, the body turns out not to be stale and the on-disk header hasn't expired?
>>
>> My understanding: From an RFC point of view, it's fine for the cache to completely ignore a 304 and not update the stored entity at all. But the response to this request should be the merge of the two responses, assuming the conditional was added by the cache.
>
> This is in line with my understanding, and since the response merging is already being done today, the only change would be to skip storing the header to disk.
>
> I think it would be wise to skip the storing only for the max-age=0 case, though.

Why limit it to the max-age=0 case? Isn't it a general improvement?

Sander
Re: mod_cache: Don't update when req max-age=0?
On Thu, May 24, 2007 10:23 am, Sander Striker wrote:

>>> From an RFC point of view, it's fine for the cache to completely ignore a 304 and not update the stored entity at all. But the response to this request should be the merge of the two responses, assuming the conditional was added by the cache.
>>
>> This is in line with my understanding, and since the response merging is already being done today, the only change would be to skip storing the header to disk.
>>
>> I think it would be wise to skip the storing only for the max-age=0 case, though.
>
> Why limit it to the max-age=0 case? Isn't it a general improvement?

It isn't: the net effect of not storing the headers to disk is that once a fresh object goes stale, it will remain stale until the end of days, because the mechanism to make that object fresh again has been removed.

If the object remains stale, a conditional request will be generated and sent to the backend on every single hit, which is unnecessary load on both the backend network and the backend webserver.

As a directive-controlled special case, this feature makes sense, but this isn't the kind of default behaviour you want to see on a cache.

A better approach might be to determine whether the headers have actually changed before writing them to disk. You needed to read the header in the first place; if the previously read header and the newly received header from the backend are the same, then don't write to disk, since it's unnecessary.

This remains RFC compliant and solves the underlying problem.

Regards,
Graham
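The compare-before-write idea could be sketched roughly as follows (Python, for illustration only, not mod_cache code). The function names and the `ignore` parameter are assumptions of this sketch. `ignore` is there because a revalidation response normally carries a new Date, so a literal comparison would rarely match; whether leaving such fields out is acceptable is not settled in this thread.

```python
def normalize(headers, ignore=()):
    """Lower-case header names, strip values, and drop ignored names."""
    skip = {name.lower() for name in ignore}
    return {
        name.lower(): value.strip()
        for name, value in headers.items()
        if name.lower() not in skip
    }


def needs_disk_write(stored, merged, ignore=()):
    """Return True only if the merged headers differ from the on-disk ones."""
    return normalize(stored, ignore) != normalize(merged, ignore)


# Hypothetical header sets, for illustration only.
stored = {"ETag": '"abc"', "Cache-Control": "max-age=86400"}
same = {"etag": '"abc"', "cache-control": "max-age=86400 "}
changed = {"ETag": '"abc"', "Cache-Control": "max-age=3600"}

print(needs_disk_write(stored, same))     # header names and padding differ only
print(needs_disk_write(stored, changed))  # Cache-Control value differs
```

With an empty `ignore`, this is the literal comparison described above; anything passed in `ignore` is a policy decision made by the caller.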
Re: mod_cache: Don't update when req max-age=0?
On Thu, 24 May 2007, Sander Striker wrote:

>>>> Does anybody see a problem with changing mod_cache to not update the stored headers when the request has max-age=0, the body turns out not to be stale and the on-disk header hasn't expired?
>>>
>>> My understanding: From an RFC point of view, it's fine for the cache to completely ignore a 304 and not update the stored entity at all. But the response to this request should be the merge of the two responses, assuming the conditional was added by the cache.
>>
>> This is in line with my understanding, and since the response merging is already being done today, the only change would be to skip storing the header to disk.
>>
>> I think it would be wise to skip the storing only for the max-age=0 case, though.
>
> Why limit it to the max-age=0 case? Isn't it a general improvement?

Consider a default cache lifetime of 86400 seconds, and requests coming in with max-age=4 (we see a lot of Mozilla downloads with this, for example). If you don't rewrite the on-disk headers, you'll end up hitting your backend on every request once the stored entry's age passes 4 seconds.

In the max-age=0 case you only force an unnecessary header write, because:

a) The written header won't be useful for other requests with max-age=0. A ground rule of caching is to not save stuff that's never used.

b) Requests with max-age!=0 aren't helped much by it. The only penalty would be when a max-age!=0 request causes a header rewrite that a max-age=0 access would otherwise have performed. Doing this single rewrite instead of the potentially thousands caused by rewriting on every max-age=0 request is a rather big win.

c) RFC-wise, it seems to me that a not-modified object is a not-modified object. There is no guarantee that the next request will hit the same cache, so nothing can expect a max-age=0 request to force a cache to rewrite its headers and then access it with max-age!=0 and get headers of that age.

d) Also, an object tends to be accessed with more or less the same max-age. So storing headers in the max-age=0 case just because the object might later be accessed with max-age!=0 makes no sense, since it's more likely that the next request for this object will have the same max-age.

/Nikke
--
Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se
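The max-age=4 argument can be checked with a toy simulation (Python, illustration only). Everything here is an assumption of the sketch: one request per second, an entry stored at t=0, and a revalidation whenever the stored entry's age exceeds the request's max-age. It is not how mod_cache is implemented.

```python
def backend_hits(request_times, req_max_age, rewrite_headers):
    """Count conditional requests that reach the backend.

    The entry is stored at t=0. A request triggers a revalidation when
    the stored entry's age exceeds the max-age given in the request.
    """
    stored_at = 0
    hits = 0
    for now in request_times:
        age = now - stored_at
        if age > req_max_age:
            hits += 1                # conditional request goes to the backend
            if rewrite_headers:
                stored_at = now      # the 304 refreshes the on-disk headers
    return hits


times = range(1, 61)  # one request per second for a minute

with_rewrite = backend_hits(times, 4, rewrite_headers=True)
without_rewrite = backend_hits(times, 4, rewrite_headers=False)
zero_with = backend_hits(times, 0, rewrite_headers=True)
zero_without = backend_hits(times, 0, rewrite_headers=False)

print(with_rewrite, without_rewrite)  # max-age=4: rewriting matters
print(zero_with, zero_without)        # max-age=0: rewriting changes nothing
```

In this model, rewriting the headers cuts backend traffic for max-age=4 requests but makes no difference to the number of backend hits for max-age=0 requests, which is the asymmetry points a) and b) rely on.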
Re: mod_cache: Don't update when req max-age=0?
On Thu, 2007-05-24 at 13:22 +0200, Niklas Edmundsson wrote:

> c) RFC-wise, it seems to me that a not-modified object is a not-modified object. There is no guarantee that the next request will hit the same cache, so nothing can expect a max-age=0 request to force a cache to rewrite its headers and then access it with max-age!=0 and get headers of that age.

Yes. RFC-wise it's fine to not update the cache with the 304. Updating cached entries is optional (RFC 2616 10.3.5, last paragraph).

The only MUST regarding 304 and caches is that you MUST ignore the 304 and retry the request without the conditional if the 304 indicates a different object than the one currently cached (i.e. ETag or Last-Modified differs). (Same section, the paragraph above.)

Regards
Henrik
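The MUST described above could be sketched like this (Python, illustration only). The function name, the return strings, and the use of plain dicts with lower-case header names are all assumptions of the sketch.

```python
def handle_304(cached, not_modified):
    """Decide what to do with a 304 received for a cached entry.

    Both arguments are dicts of headers with lower-case names.
    Returns 'retry_unconditional' if a validator in the 304 differs
    from the cached one, otherwise 'serve_cached'.
    """
    for validator in ("etag", "last-modified"):
        received = not_modified.get(validator)
        if received is not None and received != cached.get(validator):
            return "retry_unconditional"  # the 304 refers to a different object
    return "serve_cached"


# Hypothetical cached entry, for illustration only.
cached = {"etag": '"v1"', "last-modified": "Mon, 21 May 2007 10:00:00 GMT"}

print(handle_304(cached, {"etag": '"v1"'}))
print(handle_304(cached, {"etag": '"v2"'}))
```

Whether the cached headers are also rewritten in the `serve_cached` case is the optional part discussed in this thread; the sketch deliberately leaves it out.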
Re: mod_cache: Don't update when req max-age=0?
On Tue, 2007-05-22 at 11:40 +0200, Niklas Edmundsson wrote:

> Does anybody see a problem with changing mod_cache to not update the stored headers when the request has max-age=0, the body turns out not to be stale and the on-disk header hasn't expired?

My understanding: From an RFC point of view, it's fine for the cache to completely ignore a 304 and not update the stored entity at all. But the response to this request should be the merge of the two responses, assuming the conditional was added by the cache.

Regards
Henrik
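A minimal sketch of "the merge of the two responses" (Python, illustration only). It assumes headers are plain dicts with lower-case names and that fields in the 304 simply override the stored ones; the function name is hypothetical.

```python
def merge_304(stored_headers, not_modified_headers):
    """Build the headers sent to the client after a 304 revalidation.

    Fields from the 304 take precedence over stored fields. The stored
    dict is left untouched, so writing the result back to disk remains
    a separate decision.
    """
    merged = dict(stored_headers)
    merged.update(not_modified_headers)
    return merged


# Hypothetical header sets, for illustration only.
stored = {"etag": '"v1"', "date": "Mon, 21 May 2007 10:00:00 GMT",
          "content-type": "application/octet-stream"}
fresh = {"etag": '"v1"', "date": "Tue, 22 May 2007 09:40:00 GMT"}

response_headers = merge_304(stored, fresh)
print(response_headers["date"])  # the Date from the 304
print(stored["date"])            # the on-disk copy is unchanged
```

Keeping the merge separate from the write-back is what makes it possible to answer the client correctly while skipping the disk update.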
Re: mod_cache: Don't update when req max-age=0?
On Mon, May 21, 2007 4:49 pm, Niklas Edmundsson wrote:

> Does anybody see a problem with changing mod_cache to not update the stored headers when the request has max-age=0, the body turns out not to be stale and the on-disk header hasn't expired?
>
> The rationale behind this is that there are hordes of stupid download managers that always issue this kind of request, and multiple in parallel to the same file at that. This hammers the entire cache layer by causing headers to be rewritten for each request.
>
> Since max-age=0 requests can't be fulfilled without revalidating the object, they don't benefit from this header rewrite, and requests with max-age!=0 that can benefit from the header rewrite won't be affected by this change.
>
> Am I making sense? Have I missed something fundamental?

At first glance, I think doing this will break RFC 2616 compliance, and if it does break RFC compliance then I think it should not be default behaviour.

However, if it does solve a real problem for admins, then having a directive allowing the admin to enable this behaviour does make sense.

Zooming out a little bit, this seems to fall into the category of RFC violations that allow the cache to either hit the backend less, or not hit the backend at all, for the benefit of an admin who knows what they are doing.

A simple set of directives that allow an admin to break RFC compliance under certain circumstances in order to achieve certain goals does make sense.

Regards,
Graham
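The condition in the quoted proposal can be written as a small predicate (Python, illustration only). The function names and the regular-expression parsing of Cache-Control are assumptions of this sketch, not mod_cache code.

```python
import re


def request_max_age(cache_control):
    """Return the request's max-age in seconds, or None if absent."""
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)",
                      cache_control or "", re.IGNORECASE)
    return int(match.group(1)) if match else None


def should_store_headers(cache_control, not_modified, stored_fresh):
    """Return False only in the narrow case the proposal targets.

    cache_control: the request's Cache-Control value (or None)
    not_modified:  revalidation showed the body is not stale
    stored_fresh:  the on-disk header has not expired
    """
    skip = (request_max_age(cache_control) == 0
            and not_modified and stored_fresh)
    return not skip


print(should_store_headers("max-age=0", True, True))   # proposal: skip the write
print(should_store_headers("max-age=4", True, True))   # still written
print(should_store_headers("max-age=0", True, False))  # expired on disk: written
```

Because the on-disk header must still be unexpired for the write to be skipped, an entry that has gone stale on disk is still refreshed under this predicate.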
Re: mod_cache: Don't update when req max-age=0?
On Mon, 21 May 2007, Graham Leggett wrote:

>> Since max-age=0 requests can't be fulfilled without revalidating the object, they don't benefit from this header rewrite, and requests with max-age!=0 that can benefit from the header rewrite won't be affected by this change.
>>
>> Am I making sense? Have I missed something fundamental?
>
> At first glance, I think doing this will break RFC 2616 compliance, and if it does break RFC compliance then I think it should not be default behaviour.
>
> However, if it does solve a real problem for admins, then having a directive allowing the admin to enable this behaviour does make sense.

Why would it break RFC compliance? This request will never benefit from the headers being saved to disk, and the headers returned to the client should of course be those that resulted from the revalidation of the object. The only difference is that they aren't saved to disk too.

The only difference I can see is that you can't probe whether the previous request was a max-age=0 one by sending a max-age!=0 request afterwards...

> Zooming out a little bit, this seems to fall into the category of RFC violations that allow the cache to either hit the backend less, or not hit the backend at all, for the benefit of an admin who knows what they are doing.
>
> A simple set of directives that allow an admin to break RFC compliance under certain circumstances in order to achieve certain goals does make sense.

Yup. CacheIgnoreCacheControl is one of those; we use it on the offloaders that only serve large files that we know don't need the RFC behaviour.

/Nikke
--
Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se
Re: mod_cache: Don't update when req max-age=0?
On May 21, 2007, at 7:49 AM, Niklas Edmundsson wrote:

> Does anybody see a problem with changing mod_cache to not update the stored headers when the request has max-age=0, the body turns out not to be stale and the on-disk header hasn't expired?

Yes, the problem is that it will break content management systems that need to refresh a cache front-end after the content has changed.

> The rationale behind this is that there are hordes of stupid download managers that always issue this kind of request, and multiple in parallel to the same file at that. This hammers the entire cache layer by causing headers to be rewritten for each request.

Why don't you just add an ignore of cache-control on requests from those stupid download managers? A simple BrowserMatch should do.

Roy
Re: mod_cache: Don't update when req max-age=0?
Niklas Edmundsson wrote:

>> At first glance, I think doing this will break RFC 2616 compliance, and if it does break RFC compliance then I think it should not be default behaviour.
>>
>> However, if it does solve a real problem for admins, then having a directive allowing the admin to enable this behaviour does make sense.
>
> Why would it break RFC compliance?

Because when clients say max-age=0, it means "please consider all URLs as stale and revalidate them", and the server is obliged to honor this.

> This request will never benefit from the headers being saved to disk, and the headers returned to the client should of course be those that resulted from the revalidation of the object. The only difference is that they aren't saved to disk too.

If this happens you introduce a subtle bug: when the URL becomes stale on the frontend, it will remain stale to the end of days, because the entry on disk is never refreshed with new headers to show the content is fresh.

> Yup. CacheIgnoreCacheControl is one of those; we use it on the offloaders that only serve large files that we know don't need the RFC behaviour.

I was thinking of a directive like CacheOrigin [on|off], meaning that *this* cache isn't a cache at all, but rather an origin server that just happens to fetch data via HTTP from some backend if the data isn't fresh in the cache.

Regards,
Graham
Re: mod_cache: Don't update when req max-age=0?
On May 21, 2007, at 2:22 PM, Ruediger Pluem wrote:

>> Why don't you just add an ignore of cache-control on requests from those stupid download managers? A simple BrowserMatch should do.
>
> I am not quite sure what you mean by this. AFAIK you cannot set CacheIgnoreCacheControl based on env variables.

Which is why we would have to add it to the code. Note that this would be to ignore client-provided cache control, which is a good feature to have on a cache for various DoS reasons.

Roy
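As a rough illustration of ignoring client-provided cache control for selected user agents (Python, not httpd code): the agent names in the pattern and the function name are hypothetical, and request headers are assumed to be a dict with lower-case names.

```python
import re

# Hypothetical user-agent patterns; real ones would come from configuration.
IGNORE_CC_AGENTS = re.compile(r"ExampleDownloader|OtherGetter", re.IGNORECASE)


def effective_cache_control(request_headers):
    """Return the request Cache-Control the cache should honor.

    Returns None when the User-Agent matches the ignore pattern, so the
    request is then treated as if it carried no Cache-Control at all.
    """
    agent = request_headers.get("user-agent", "")
    if IGNORE_CC_AGENTS.search(agent):
        return None
    return request_headers.get("cache-control")


print(effective_cache_control(
    {"user-agent": "ExampleDownloader/1.0", "cache-control": "max-age=0"}))
print(effective_cache_control(
    {"user-agent": "Mozilla/5.0", "cache-control": "max-age=0"}))
```

Matching on User-Agent is only one possible selector; the thread leaves open how such a feature would actually be configured.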