What my initial idea for this was:

We feed the soon-to-expire URLs into a priority queue (similar to mod_mem_cache's).

A pool of threads reads the queue, fetches the content, and re-fills the cache with fresh responses.

The benefit of this method is that we control exactly how hard we hit the back ends, as well as fetching the important stuff first.
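
Roughly, a minimal sketch of the queue and worker pool with plain pthreads (enqueue_refresh and fetch_and_store are placeholder names, not mod_cache APIs):

#include <pthread.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* One entry per soon-to-expire URL; earliest expiry is most urgent. */
typedef struct refresh_entry {
    char *url;
    time_t expires;
    struct refresh_entry *next;  /* kept sorted, so the head is highest priority */
} refresh_entry;

static refresh_entry *queue_head;
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t queue_cond = PTHREAD_COND_INITIALIZER;

/* Called by the cache when an entry gets close to its expiry time. */
static void enqueue_refresh(const char *url, time_t expires)
{
    refresh_entry *e = malloc(sizeof(*e));
    refresh_entry **p;

    e->url = strdup(url);
    e->expires = expires;

    pthread_mutex_lock(&queue_lock);
    /* Insert in expiry order so the most urgent URL sits at the head. */
    for (p = &queue_head; *p && (*p)->expires <= expires; p = &(*p)->next)
        ;
    e->next = *p;
    *p = e;
    pthread_cond_signal(&queue_cond);
    pthread_mutex_unlock(&queue_lock);
}

/* fetch_and_store() stands in for the real backend request + cache write. */
extern void fetch_and_store(const char *url);

/* Run N of these; N is the hard cap on concurrent backend fetches. */
static void *refresh_worker(void *unused)
{
    (void)unused;
    for (;;) {
        refresh_entry *e;

        pthread_mutex_lock(&queue_lock);
        while (queue_head == NULL)
            pthread_cond_wait(&queue_cond, &queue_lock);
        e = queue_head;
        queue_head = e->next;
        pthread_mutex_unlock(&queue_lock);

        fetch_and_store(e->url);
        free(e->url);
        free(e);
    }
    return NULL;
}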

This is slightly different from where this thread is going.

I'm open to both, but I think the method below could still result in swamping the backend server when lots of unique URLs get requested.

--Ian

Parin Shah wrote:
We have been down this road.  One way to solve it is to allow
mod_cache to reload an object while serving the "old" one.

Example:

cache /A for 600 seconds

after 500 seconds, request /A with a special header (or from a special client,
etc.) and the cache does not serve from cache, but rather pretends the entry has
expired and does the normal refresh.

The cache will continue to serve /A even while it is refreshing it.
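
A rough sketch of that decision, assuming a per-object refresh_after threshold
alongside the normal expiry (the names here are made up, not mod_cache's, and
locking around the metadata is omitted):

#include <time.h>

/* Per-object metadata for the /A example: cached for 600s, with an
 * early-refresh threshold at 500s. */
typedef struct {
    time_t expires;        /* stored_time + 600 */
    time_t refresh_after;  /* stored_time + 500 */
    int    refreshing;     /* a background reload is already in flight */
} cache_meta;

typedef enum { SERVE_CACHED, SERVE_CACHED_AND_REFRESH, FULL_MISS } cache_action;

/* start_background_reload() is a placeholder for handing the refresh off
 * to another thread/process; 'force' models the special header or client. */
extern void start_background_reload(const char *url);

static cache_action on_cache_hit(const char *url, cache_meta *m, int force)
{
    time_t now = time(NULL);

    if (now >= m->expires)
        return FULL_MISS;             /* really expired: normal miss path */

    if ((force || now >= m->refresh_after) && !m->refreshing) {
        m->refreshing = 1;            /* only one reload at a time */
        start_background_reload(url); /* old copy keeps being served */
        return SERVE_CACHED_AND_REFRESH;
    }
    return SERVE_CACHED;              /* still fresh, serve as usual */
}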



As Graham suggested, such a mechanism will not refresh pages that are
unpopular but expensive to load, which could incur a lot of
overhead.  But other than that, this looks like a really good solution.


Also, one of the flaws of mod_disk_cache (at least the version I am looking
at) is that it deletes objects before reloading them.  It is better for many
reasons to only replace them.  That's the best way to accomplish what I
described above.
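
For what it's worth, replace-instead-of-delete is easy to get atomically on
disk with rename(2); a sketch (replace_cached_object is a made-up helper, not
mod_disk_cache code):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Write the fresh object next to the old one, then rename() it into
 * place.  rename(2) is atomic on POSIX, so a concurrent reader sees
 * either the complete old object or the complete new one, never a
 * deleted or half-written file. */
static int replace_cached_object(const char *path, const char *data,
                                 size_t len)
{
    char tmp[1024];
    int fd;

    snprintf(tmp, sizeof(tmp), "%s.XXXXXX", path);
    fd = mkstemp(tmp);
    if (fd < 0)
        return -1;

    if (write(fd, data, len) != (ssize_t)len) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    close(fd);

    /* The swap: the old object is served right up to this point. */
    if (rename(tmp, path) != 0) {
        unlink(tmp);
        return -1;
    }
    return 0;
}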


If we implement it the way you suggested, then this problem would
automatically be solved.

-Parin.

