I've been investigating memcached for an upcoming project, and
I'm a little surprised there's no race-free way to guarantee that
just a single process performs an expensive (to-be-cached)
computation. The fundamental requirements are that (1) claiming
responsibility for the expensive computation is atomic and race-free,
and (2) subsequent clients can block (as opposed to polling) until
the responsible party finishes.

I propose naming the command get_locked:

    get_locked KEY

The server then responds in one of three ways.

If the cache has the value, memcached responds:

    value_found
    VALUE

If another client is already doing the expensive computation,
memcached responds:

    waiting

and then, after the other client finishes, either

    success
    VALUE

or

    error
    MESSAGE

If the key KEY is not found at all (or has expired), memcached
responds:

    compute

and the client must then reply with either

    success EXPIRATION_TIME
    VALUE

or

    error
    MESSAGE
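
To make the intended semantics concrete, here is a minimal in-process
sketch in Python. It is not memcached code and not the patch itself: the
LockedCache class, the _Pending helper, and the (status, value) return
shape are all hypothetical names made up for illustration, and threads
stand in for network clients. It only models the three outcomes above:
value_found, waiting followed by success or error, and compute.

```python
import threading
import time


class _Pending:
    """Hypothetical helper: one in-flight computation for a key."""

    def __init__(self):
        self.done = threading.Event()
        self.value = None
        self.error = None


class LockedCache:
    """In-process model of the proposed get_locked semantics (an assumption,
    not memcached's implementation)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}   # key -> (value, expiry as a monotonic timestamp)
        self._pending = {}  # key -> _Pending for the client now computing it

    def get_locked(self, key, compute, ttl):
        # The lookup and the claim happen under one lock, so exactly one
        # caller can become the owner of a missing or expired key.
        with self._lock:
            entry = self._values.get(key)
            if entry is not None and entry[1] > time.monotonic():
                return "value_found", entry[0]
            pending = self._pending.get(key)
            owner = pending is None
            if owner:
                pending = self._pending[key] = _Pending()

        if not owner:
            # "waiting": block on an event instead of polling.
            pending.done.wait()
            if pending.error is not None:
                raise RuntimeError(pending.error)  # "error MESSAGE"
            return "success", pending.value

        # "compute": this caller is responsible for the expensive work.
        try:
            value = compute()
        except Exception as exc:
            with self._lock:
                del self._pending[key]  # release the claim so others can retry
            pending.error = str(exc)
            pending.done.set()
            raise
        with self._lock:
            self._values[key] = (value, time.monotonic() + ttl)
            del self._pending[key]
        pending.value = value
        pending.done.set()
        return "compute", value
```

If several threads call get_locked("k", expensive, ttl=60) at the same
time, only one of them should run expensive(); the rest block until it
finishes and then receive the same value.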
-----
I'm surprised how little I can find on this topic. References to the
stampeding problem (which get_locked would resolve) are about
all the info I can find. Sorry in advance if I'm missing some earlier
discussion; I've searched quite a bit and can't find anything about
it.
Are people interested in a patch? It really doesn't seem too hard...