Brian Pane wrote:
> Sure, by that metric, it's a waste of time: the current code can't even
> cache responses that arrive in multiple brigades, which is a prerequisite
> for shadowing incomplete requests. But so what? The cache is very much a
> work in progress; we should judge it by where the code is going, not where
> it is today.
True, but we need to ensure we don't make design decisions now that make
certain features impossible to implement. The old cache code worked
pretty well, but had some flaws that only a rewrite and redesign could
fix. We shouldn't avoid implementing something just because it is hard :)
> For the expiration case, there's a much easier solution than shadowing the
> incomplete response. Add a new state for cache entries: "being_updated."
> When you get a request for a cached object that's past its expiration date,
> set the cache entry's state to "being_updated" and start retrieving the new
> content. Meanwhile, as other threads handle requests for the same object,
> they check the state of the cache entry and, because it's currently being
> updated, they deliver the old copy from the cache rather than dispatching
> the request to the backend system that's already working on a different
> instance of the same request. As long as the thread that's getting the new
> content can replace the old content with the new content atomically,
> there's
> no reason to make any other threads wait for the new content.
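If I follow the idea, the per-request check boils down to something like
the sketch below. This is only an illustration - none of these names are
the real mod_cache structures or functions, and the state transition
would need proper locking/atomics rather than plain assignments:

#include <stdio.h>
#include <time.h>

typedef enum { ENTRY_FRESH, ENTRY_BEING_UPDATED } entry_state;

typedef struct {
    entry_state state;
    const char *content;   /* cached response body */
    time_t      expires;   /* absolute expiry time */
} cache_entry;

/* stand-in for fetching a new copy from the backend */
static const char *fetch_from_backend(void) { return "new content"; }

/* one request-handling thread hitting the cache */
static void handle_request(cache_entry *e)
{
    time_t now = time(NULL);

    if (e->state == ENTRY_BEING_UPDATED || now <= e->expires) {
        /* either still fresh, or someone else is already refreshing it:
         * serve whatever copy we have instead of hitting the backend */
        printf("serving cached: %s\n", e->content);
        return;
    }

    /* first thread to notice the expiry refreshes the entry; in real
     * code this transition has to be an atomic compare-and-swap */
    e->state   = ENTRY_BEING_UPDATED;
    e->content = fetch_from_backend();  /* old copy replaced atomically */
    e->expires = now + 300;             /* arbitrary new lifetime */
    e->state   = ENTRY_FRESH;
    printf("serving refreshed: %s\n", e->content);
}

int main(void)
{
    cache_entry e = { ENTRY_FRESH, "old content", 0 };  /* already expired */
    handle_request(&e);   /* refreshes and serves the new content */
    handle_request(&e);   /* plain cache hit */
    return 0;
}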
Hmmm... ok, not a bad idea, though I see a few hassles.
What if a user force-reloads the page? In theory the cached copy should
be invalidated immediately - but here it can't be, because the shadowing
threads still need to read the old cached content in the meantime.
Also - there will be a load spike on the backend until the first copy of
the page has been cached (in the case where no previous cached copy
existed).
What happens if an attempt to update an expired cached page hangs? The
proxy could find itself serving stale content for a long time while it
waits for the request to time out.
> * It's going to take a while to make the shadowing work
> (due to all the race conditions that have to be addressed).
I don't see any significant race conditions though.
All that needs to happen is that each cached response becomes readable
from the cache the moment the entry is created, rather than the moment
the download is complete. A flag on the cache entry marks it as "still
busy". While this flag is set, shadowing threads know to keep waiting
for more data to appear in the cache. Once the flag is cleared, the
shadow thread is in the normal CACHE_OUT case and finishes up the
request knowing it to be complete.
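A rough sketch of what I mean, using plain pthreads rather than the APR
primitives the real code would use (all of the names here are invented
for illustration, not existing mod_cache code):

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  more_data;
    char   body[65536];
    size_t len;           /* bytes cached so far */
    bool   busy;          /* true while the download is still in progress */
} cache_entry;

/* writer: the thread talking to the backend appends each chunk of the
 * response to the entry and signals any shadowing readers */
static void cache_append(cache_entry *e, const char *data, size_t n)
{
    pthread_mutex_lock(&e->lock);
    memcpy(e->body + e->len, data, n);
    e->len += n;
    pthread_cond_broadcast(&e->more_data);
    pthread_mutex_unlock(&e->lock);
}

static void cache_finish(cache_entry *e)
{
    pthread_mutex_lock(&e->lock);
    e->busy = false;                      /* entry is now complete */
    pthread_cond_broadcast(&e->more_data);
    pthread_mutex_unlock(&e->lock);
}

/* reader: a shadowing thread serves whatever is in the cache, and while
 * the busy flag is set it waits for more; once the flag clears this is
 * just the normal CACHE_OUT case and the request can be finished */
static void shadow_read(cache_entry *e)
{
    size_t sent = 0;
    pthread_mutex_lock(&e->lock);
    for (;;) {
        if (sent < e->len) {
            fwrite(e->body + sent, 1, e->len - sent, stdout);
            sent = e->len;
        }
        if (!e->busy)
            break;                        /* complete: finish the request */
        pthread_cond_wait(&e->more_data, &e->lock);
    }
    pthread_mutex_unlock(&e->lock);
}

static void *backend_thread(void *arg)
{
    cache_entry *e = arg;
    cache_append(e, "first chunk ", 12);
    cache_append(e, "second chunk\n", 13);
    cache_finish(e);
    return NULL;
}

int main(void)
{
    cache_entry e = { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER,
                      "", 0, true };
    pthread_t writer;
    pthread_create(&writer, NULL, backend_thread, &e);
    shadow_read(&e);          /* shadow reader streams data as it arrives */
    pthread_join(writer, NULL);
    return 0;
}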
Shadow support was anticipated in the original design for the current
cache, so it should be pretty easy to implement.
> * In the meantime, I need a solution for caching responses
> that don't arrive all in a single brigade. I'd like to
> get this added now, and not wait for the shadowing support.
Shadowing depends on multiple brigades being cacheable, not the other
way around, so definitely go ahead with the work on multiple brigades.
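For what it's worth, the save side would then look roughly like the loop
below - walk each brigade as it arrives, append the data to the entry,
and only mark the entry complete when the EOS bucket shows up.
entry_append()/entry_complete() and the cache_entry type are made-up
placeholders here, not existing mod_cache functions:

#include "apr_buckets.h"

typedef struct cache_entry cache_entry;        /* placeholder entry type */
void entry_append(cache_entry *e, const char *data, apr_size_t len);
void entry_complete(cache_entry *e);

static apr_status_t cache_save_brigade(cache_entry *entry,
                                       apr_bucket_brigade *bb)
{
    apr_bucket *b;

    for (b = APR_BRIGADE_FIRST(bb);
         b != APR_BRIGADE_SENTINEL(bb);
         b = APR_BUCKET_NEXT(b)) {

        const char *data;
        apr_size_t len;
        apr_status_t rv;

        if (APR_BUCKET_IS_EOS(b)) {
            entry_complete(entry);      /* clear the "still busy" flag */
            break;
        }

        rv = apr_bucket_read(b, &data, &len, APR_BLOCK_READ);
        if (rv != APR_SUCCESS) {
            return rv;
        }

        entry_append(entry, data, len); /* add this chunk to the cache */
    }

    return APR_SUCCESS;
}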
Regards,
Graham
--
-----------------------------------------
[EMAIL PROTECTED]
"There's a moon
over Bourbon Street
tonight..."