Graham Leggett writes:

> > >>> The first two are a lot like the Header directive from mod_headers, in
> > >>> that they allow you to set|unset|add|append to a given header. They
> > >>> also take an optional regular expression argument, which is matched
> > >>> against r->uri to control application of the directive.
>
> Is there a reason why mod_headers can't be asked or modified to do this?
>
> This is useful functionality I will agree, but I don't think adding more
> directives to do almost the same thing as we can already do is a good
> idea.

Hi,

These directives are all intended to be used as part of a reverse/caching proxy setup. The mod_headers module does indeed allow one to set the incoming headers, but there are a few problems:

1) It is not as flexible as one might like -- in particular, I added an optional pattern-match field which is matched against the request URI. If the URI satisfies the pattern match, the header directive is applied. If not, not.

2) It only works on the incoming headers. I wrote code to supply symmetrical directives that set both incoming and outgoing headers.

3) It is not clear how mod_headers and mod_proxy should work together. Both modules implement fixup handlers. Potential ordering problems or other conflicts seem not unlikely.

> > >>> The last two are special-purpose directives that control the setting
> > >>> of Expires/Cache-Control and Date headers, respectively. They also
> > >>> take optional regular expressions to limit their application at
> > >>> request time.
>
> ProxyResponseExpiresVector
> CacheFreshenDate
>
> Can you explain what the above directives do?

Sure.

CacheFreshenDate, if 'On', sets the Date header to be current when a document is returned from the cache. (The default, 'Off', is the same as the regular mod_proxy behavior, which is not to change any headers at all, including the Date header.)

ProxyResponseExpiresVector allows one to decouple the internal mod_proxy caching behavior from the caching recommendations that are sent to the outside world. It takes an argument '<seconds>', which it uses to update the Expires and Cache-Control: max-age headers on proxy responses to reflect expiration "seconds" into the future. It takes an additional optional argument '<pattern-match>', which, as in the header-set/unset directives above, is matched against the request URI to control application of the directive.

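To make the ProxyResponseExpiresVector arithmetic concrete, here is a rough sketch in Python. It is not the patch itself (which is C inside mod_proxy); the function name apply_expires_vector and its parameters are made up for illustration, Python's re.search stands in for whatever regex matching the module really does, and overwriting the whole Cache-Control header is a simplification.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def apply_expires_vector(headers, uri, seconds, pattern=None, now=None):
    """Hypothetical helper: return a copy of `headers` with Expires and
    Cache-Control: max-age pushed `seconds` into the future.

    If `pattern` is given and does not match `uri`, the headers are
    returned unchanged, mirroring the optional <pattern-match> argument.
    """
    if pattern is not None and re.search(pattern, uri) is None:
        return dict(headers)

    now = now or datetime.now(timezone.utc)
    updated = dict(headers)
    # HTTP dates are expressed in GMT; usegmt=True needs a UTC-aware datetime.
    updated["Expires"] = format_datetime(now + timedelta(seconds=seconds), usegmt=True)
    # Simplification: replaces any existing Cache-Control value outright.
    updated["Cache-Control"] = f"max-age={seconds}"
    return updated


# Usage with a fixed clock so the result is reproducible.
fixed_now = datetime(2001, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
response = apply_expires_vector({}, "/news/index.html", 30, r"^/news/", now=fixed_now)
print(response["Expires"])        # Mon, 01 Jan 2001 00:00:30 GMT
print(response["Cache-Control"])  # max-age=30
```

The point of the sketch is only that the values sent to clients are derived from the directive's '<seconds>' argument, independently of how long the proxy keeps the document in its own cache.
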
We use ProxyResponseExpiresVector to tell the world at large to cache some of our heavily-dynamic entry pages for a shorter time than we cache them internally. We do so for two reasons:

1) We set the times-to-live for most of our pages to 60 seconds, even on pages that change, on average, every 15 minutes. We do this because we depend on both advertising revenue and on "traffic growth and credibility" to support our work (distributing content from 85+ African publishers to a global audience -- most of our publishers would not be able to reach this audience or to generate revenue from such distribution without us). Both advertising and investor/partner/public perception are heavily affected by "audited" traffic metrics. The audited (and I use the term very loosely <sigh>) traffic information comes from our log files. While I would very much prefer not to engage in even this relatively non-aggressive form of cache-busting, we don't really have a lot of choice. When we experimented with longer TTLs, our traffic dropped significantly.

2) Some of our heavily-used and updated news pages have reasonable times-to-live of three to five minutes. So that's how long we want mod_proxy to cache them. Unfortunately (for reasons that are not entirely clear but that perhaps have something to do with the non-freshened date behavior mentioned above), we were seeing big spikes in accesses around the expiration times of the most heavily used of these pages. The pages take a couple of seconds to construct themselves, and there was a nasty pile-up when each traffic peak coincided with the proxy finding a stale copy in the cache, leading to multiple requests in quick succession to our backend server before a new copy could be placed in the cache. We were tearing our hair out. Setting ProxyResponseExpiresVector to 30 seconds has made the problem largely disappear. I think that this is because the accesses are spread out more, and the majority of them get a page returned from the cache, which happens very, very quickly and causes no pushing and shoving at the backend!

I hope this clarifies the intent behind my patch. Documentation, with examples, can be found here:

http://allafrica.com/tools/apache/mod_proxy/mod_proxy.html#cachefreshendate

In general, patches for all the relevant files (including the manual page) are at:

http://allafrica.com/tools/apache/mod_proxy/

Thanks again,
Kwin
