Brad: One day I'll use the totally sweet Gearman ;) I linked to your post in the FAQ.

Steven: Turns out I'm not as dumb as I thought: http://www.socialtext.net/memcached/index.cgi?faq#how_to_prevent_clobbering_updates_stampeding_requests

... my original FAQ entry opens (clearly, I hope) by saying what you suggested, then wanders off into alternatives, which were discussed further on the mailing list.

So! Sorry for the back and forth :( Apparently I forgot what I had just written. You're right to question why alternatives even needed discussion, but some of us have/had fairly awkward caching primitives, which leads to something like this.

-Dormando

Brad Fitzpatrick wrote:
Late to this party, but I have to mention Gearman here.

On a cache miss, instead of going to the database directly, issue a
Gearman request with a "uniq" property, then the Gearman server will
combine all the duplicate requests and only dispatch one worker.  The
worker then puts it in the cache before returning to the Gearman router
(gearmand), and then gearmand multiplexes the result back to all waiting
callers.


On Wed, 25 Jul 2007, dormando wrote:

Hey,

So I'm up late adding more crap to the memcached FAQ, and I'm wondering
about a particular access pattern:

- Key A is hit very often (many times per second).
- Key A goes missing.
- Several dozen processes all get a cache miss on A at the same time,
then run an SQL query (or whatever), and try to set or add the value
back into memcached.

Sometimes this can be destructive to a database, and can happen often if
the expire time on the data is low for some reason.

What approaches do folks typically use to deal with this more elegantly?
The best suggestion I've heard is to try to 'add' the key (or a
separate 'lock' key) back into memcached, and only do the query if
you 'win' that lock. Everyone else microsleeps and retries a few times
before running the query.
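(A minimal sketch of that add-as-lock pattern. `DictCache` is a stand-in for a memcached client; the only semantics relied on are memcached-style get/add/set/delete, where 'add' fails if the key already exists. Names and timeouts are illustrative.)

```python
import time

class DictCache:
    """In-memory stand-in with memcached-style add() semantics."""
    def __init__(self): self._d = {}
    def get(self, k): return self._d.get(k)
    def set(self, k, v, ttl=0): self._d[k] = v
    def add(self, k, v, ttl=0):
        if k in self._d:
            return False       # like memcached: add fails if key exists
        self._d[k] = v
        return True
    def delete(self, k): self._d.pop(k, None)

def cached_fetch(cache, key, recompute, lock_ttl=3, retries=20, nap=0.05):
    """On a miss, 'add' a lock key; only the winner runs the query.
    Losers microsleep and re-check the cache a few times."""
    value = cache.get(key)
    if value is not None:
        return value
    lock_key = key + ":lock"
    for _ in range(retries):
        if cache.add(lock_key, 1, lock_ttl):   # we won the lock
            try:
                value = recompute()
                cache.set(key, value)
                return value
            finally:
                cache.delete(lock_key)
        time.sleep(nap)                        # lost: microsleep, retry
        value = cache.get(key)
        if value is not None:
            return value
    return recompute()  # gave up waiting; run the query anyway
```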

Also in most of these cases you should really run a tiered cache, with
this type of data being stored in a local cache and in memcached.
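(A tiered cache in the sense above can be as small as this: a short-TTL in-process dict in front of the shared cache, so hot keys don't hit memcached on every request. The class is a sketch with made-up names, assuming the shared tier exposes get/set.)

```python
import time

class TieredCache:
    """Local in-process cache in front of a shared cache. Hot keys are
    served locally for a short TTL, shielding the shared tier (and the
    database behind it) from per-request traffic."""

    def __init__(self, shared, local_ttl=1.0):
        self.shared = shared     # e.g. a memcached client
        self.local = {}          # key -> (value, expires_at)
        self.local_ttl = local_ttl

    def get(self, key):
        hit = self.local.get(key)
        if hit is not None and hit[1] > time.time():
            return hit[0]        # fresh local copy; no network round trip
        value = self.shared.get(key)
        if value is not None:
            self.local[key] = (value, time.time() + self.local_ttl)
        return value

    def set(self, key, value):
        self.shared.set(key, value)
        self.local[key] = (value, time.time() + self.local_ttl)
```

The short local TTL bounds how stale a hot key can get, while collapsing per-second traffic on it down to roughly one shared-cache fetch per TTL per process.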

This really isn't a common case, but sucks hard when it happens. In the
back of my mind I envision a different style 'get' command, which
defaults to a mutex operation on miss. So you'd do the special 'get',
and on a miss one caller would receive a special return code saying
they're clear to update the data (their subsequent 'set' would release
the lock). Everyone else's command could optionally return immediately,
or hang (for a while) until the data's been updated.
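(The semantics dormando is imagining might look like this sketch: a 'get' that, on a miss, hands exactly one caller a "you're clear to update" token, while everyone else either returns immediately or blocks until the winner's 'set' arrives. This is an in-memory illustration of the idea, not an existing memcached command; all names are hypothetical.)

```python
import threading

MISS_WIN, MISS_WAIT = "miss-win", "miss-wait"

class LeaseCache:
    """Sketch of the proposed 'get' variant: first caller on a miss
    gets MISS_WIN (go run the query); the rest get MISS_WAIT, and may
    optionally block until the winner's set() releases the lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}
        self._pending = {}   # key -> Event, set once the data arrives

    def get(self, key, wait=0.0):
        with self._lock:
            if key in self._data:
                return ("hit", self._data[key])
            ev = self._pending.get(key)
            if ev is None:
                self._pending[key] = threading.Event()
                return (MISS_WIN, None)   # caller is clear to update
        if wait and ev.wait(wait):        # optionally hang for a while
            with self._lock:
                return ("hit", self._data.get(key))
        return (MISS_WAIT, None)          # or return immediately

    def set(self, key, value):
        with self._lock:
            self._data[key] = value
            ev = self._pending.pop(key, None)
        if ev:
            ev.set()  # release the implicit lock; wake blocked callers
```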

Just throwing out ideas. Thoughts?

-Dormando
