On 2010-08-05 07:59, Martin Pool wrote:
If the main problem is "user changes something but doesn't see the change
reflected in the page," that's the same problem we have with replication
lag. Couldn't we solve that in the same way, by having a user bypass
memcached for a while after a POST?
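
For illustration, the bypass-after-POST idea could be sketched roughly as follows. This is not Launchpad code: the class name, the 10-second window, and the plain dict standing in for memcached are all assumptions made for the example.

```python
import time


class BypassAfterWriteCache:
    """Hypothetical fragment cache that a user skips shortly after a POST."""

    def __init__(self, bypass_seconds=10.0, clock=time.monotonic):
        self.bypass_seconds = bypass_seconds  # assumed window, tune to taste
        self.clock = clock                    # injectable so it can be tested
        self._fragments = {}   # key -> rendered fragment (stand-in for memcached)
        self._last_write = {}  # user -> time of that user's most recent POST

    def note_post(self, user):
        """Record that this user just changed something."""
        self._last_write[user] = self.clock()

    def _bypassing(self, user):
        last = self._last_write.get(user)
        return last is not None and self.clock() - last < self.bypass_seconds

    def render(self, user, key, render_fresh):
        """Return a fragment, skipping the cache for recent writers."""
        if self._bypassing(user):
            # Recent writer: always render from the database.
            return render_fresh()
        if key not in self._fragments:
            self._fragments[key] = render_fresh()
        return self._fragments[key]
```

Note that only the user who made the change gets the fresh view; everyone else still sees whatever is cached until it expires or is invalidated.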
We could. It would be a fairly cute way to solve the "but I thought I
just changed that?" bug, but it's not a total solution: "but Jeroen,
I thought you said you fixed that?" On the whole I would still call
it a bandaid.
Actually I think it does solve the problem when the cached lifetime is
shorter than the time it takes for me to tell you I've done it and for
you to load up the page with the expectation to see the work done. This
is why I'm suggesting very short expiry times.
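
A minimal sketch of the short-expiry approach, assuming a simple in-process store rather than real memcached (the class and parameter names are made up for the example). The point it shows is that staleness is bounded by the expiry time, while a burst of requests within that time costs a single render.

```python
import time


class ShortLivedCache:
    """Hypothetical cache whose entries expire after a very short time."""

    def __init__(self, ttl_seconds=1.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds  # assumed default; could be shorter
        self.clock = clock              # injectable so it can be tested
        self._entries = {}              # key -> (expires_at, value)

    def get_or_render(self, key, render_fresh):
        """Return the cached value if still fresh, else re-render and store."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            return entry[1]
        value = render_fresh()
        self._entries[key] = (now + self.ttl_seconds, value)
        return value
```

With a one-second expiry, a hundred requests for the same fragment inside that second trigger one render, and nobody ever sees data more than a second old because of this cache.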
Of course there's also replication lag, browser caching, transparent
proxies, and the reverse proxy so I am taking a bit of an
I-don't-need-to-outrun-the-bear-I-just-need-to-outrun-you view. If skew
from those other things isn't a problem now, I'm saying we could
cheaply ensure that memcached doesn't make things worse.
One thing we could do is to use feature flags to turn on or off TAL
caching, so that we can make the correctness/throughput tradeoff
dynamically when we're being slashdotted. (Again, Flickr etc.
apparently use this technique.)
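
As a rough sketch of the flag-controlled variant: the flag name "tal_cache.enabled", the flags dict, and the helper below are hypothetical and not an existing Launchpad API. The flag is read on every request, so flipping it takes effect without a code rollout.

```python
# Hypothetical runtime flags; in practice these would live somewhere an
# operator can change them without deploying code.
FLAGS = {"tal_cache.enabled": False}


def cached_fragment(key, render_fresh, store, flags=FLAGS):
    """Render a fragment, using the cache only while the flag is on."""
    if not flags.get("tal_cache.enabled", False):
        # Caching off: favour correctness, always render from scratch.
        return render_fresh()
    # Caching on: favour throughput, reuse the stored fragment if present.
    if key not in store:
        store[key] = render_fresh()
    return store[key]
```

While the flag is off nothing is written to the store, so turning it on during a spike starts from an empty cache.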
I'd want it enabled all the time--but with expiry time set just long
enough to take the edge off a load spike for very specific fragments.
Could be as short as a second for all I care.
When slashdot strikes, I would _not_ want our users to time out until
the oopses show up in our email the next morning, and some enterprising
engineer checks the referrer, and the problem is debated with IS, and
decisions are held off until someone responsible comes online, and then
caching is enabled either by cowboy patch or a multi-handoff review
procedure, and finally the jolly lot of us figure out whether any
glitches are due to the spike, to a systemic failure, or to pre-existing
problems that we were hiding because caching was disabled.
Maybe I'm over-focusing on the Sudden Deadly Spike. I just find it a
useful way to think about memcached because it removes all temptation to
reduce timeout counts without fixing latency. It's also what makes me
feel that low hit rates are fine for normal days--as long as they shoot
up (and app/db load stays relatively steady) when Google's Doodle of the
Day happens to link to Bug #1.
Jeroen
_______________________________________________
Mailing list: https://launchpad.net/~launchpad-dev