On Jun 18, 2008, at 23:02, Daniel wrote:
> I have a couple other ideas I'll share in case anyone likes them...
>
> 1.) Let the application decide when an object is "too stale."
> Currently memcached is set up so an expired element is never
> available, even though it's still in memory.
>
> Perhaps another way memcached could work is to report back, with the
> data, the age of the data. Then the app can decide if that is too old
> or not, based on its needs, and refresh as necessary.
This is one of the approaches that has already been described.
Adding further metadata to the cache and changing the protocol to
return it isn't really an option at this point, but it's easy to add
to the data in your application.
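For example, a rough sketch of wrapping the age into the value itself,
assuming a pymemcache-style Python client (the address, key names, and
threshold here are just illustrative):

import json
import time

from pymemcache.client.base import Client

client = Client(("127.0.0.1", 11211))  # illustrative address

MAX_AGE = 300  # seconds this application is willing to tolerate


def set_with_age(key, value):
    # Store the write time alongside the value.
    payload = json.dumps({"stored_at": time.time(), "value": value})
    client.set(key, payload.encode("utf-8"))


def get_if_fresh(key):
    raw = client.get(key)
    if raw is None:
        return None  # true miss
    payload = json.loads(raw)
    if time.time() - payload["stored_at"] > MAX_AGE:
        return None  # present, but too old for this caller
    return payload["value"]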
>
> 2.) Rather than having a dog pile, you could set a magic "I'm getting
> that" marker which is written to the cache on a miss (best if it's
> even part of the original get request, actually). Other processes,
> rather than jumping to the database, just wait in a loop with some
> random timeouts, calling memcached repeatedly until the data is
> available.
You can do that today with a derived key.
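E.g. derive a "building" key from the real one and let add() decide who
does the rebuild; a sketch assuming a pymemcache-style client (names
and timings are made up):

import random
import time

from pymemcache.client.base import Client

client = Client(("127.0.0.1", 11211))  # illustrative address


def fetch(key, rebuild, lock_ttl=30, wait=0.05, attempts=20):
    # `rebuild` is the application's expensive load (e.g. a DB query)
    # and is assumed to return bytes or str.
    value = client.get(key)
    if value is not None:
        return value

    # Derived key: whoever wins the add() owns the rebuild.
    if client.add(key + ":building", b"1", expire=lock_ttl, noreply=False):
        value = rebuild()
        client.set(key, value)
        client.delete(key + ":building")
        return value

    # Someone else is rebuilding; poll with small random sleeps.
    for _ in range(attempts):
        time.sleep(wait + random.uniform(0, wait))
        value = client.get(key)
        if value is not None:
            return value

    # Gave up waiting; fall back to the expensive path.
    return rebuild()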
>
> Now, for the big question I've been sitting on for months...
>
> Has anyone worked out a system using memcached that guarantees
> memcached has the most recent data? Essentially, tying memcached to
> the database so memcached is never allowed to contain any stale data?
Yes, many people have done things like this.

Well, "guarantee" is actually a bit of a difficult word because it's
not like you get two-phase commit or anything, but I used to have an
application that would push cache updates through as part of DB
updates. I'd actually push the cache updates through *before* the DB
writes because the DB writes were async and conflicts were resolvable.
If not that, then you can at least have a post-transaction cache
replacement (cache_fu in Rails supports this out of the box, and I've
built similar things for Java a few times).
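A rough sketch of the post-transaction replacement idea, with a
hypothetical DB-API connection and a made-up users table; the point is
only that the cache write happens after the commit succeeds:

import json


def save_user(db, cache, user_id, fields):
    # `db` is assumed to be a DB-API connection, `cache` a memcached
    # client, and the `users` table / %s paramstyle are illustrative.
    with db:  # commits on clean exit, rolls back on exception
        cur = db.cursor()
        cur.execute(
            "UPDATE users SET name = %s, email = %s WHERE id = %s",
            (fields["name"], fields["email"], user_id),
        )

    # Only reached if the transaction committed: replace the cached
    # copy rather than just expiring it, so the next reader doesn't
    # have to hit the database at all.
    cache.set("user:%d" % user_id, json.dumps(fields).encode("utf-8"))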
>
> I've looked at some solutions involving a timestamp with every
> record, a revision code, database row locking, etc. I think I've
> determined that it can be made to work with the data itself, by
> disabling caching when multiple writes are being processed, however I
> was hoping to find out if anyone's actually made it work.
Why would you disable caching just because something's writing?
There's always a last write.
Having a version column (I used to call it a ``write token'' or
something like that) ensures that you are writing against the correct
data.
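A write token can be as simple as an optimistic-locking UPDATE that
only lands if the version you read is still current; a sketch with a
made-up users table (the %s paramstyle is the psycopg2/MySQLdb one):

def update_with_token(db, user_id, new_name, expected_version):
    # Assumes a `users` table with an integer `version` column; the
    # names here are illustrative, not from any particular schema.
    cur = db.cursor()
    cur.execute(
        "UPDATE users SET name = %s, version = version + 1 "
        "WHERE id = %s AND version = %s",
        (new_name, user_id, expected_version),
    )
    db.commit()
    if cur.rowcount == 0:
        # Someone else wrote since we read; re-read and retry,
        # or do the three-way merge described below.
        raise RuntimeError("stale write token for user %s" % user_id)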
One option to ensure correctness is to always read from the DB when
doing a write and do a three-way merge between the state you were
originally in, the state you were trying to push, and the state of the
records currently in the DB. It all depends on what you're doing.
>
> From what I understand, a system like this can only work if every
> application that accesses the data does its part, but I haven't seen
> any proven examples, and it seems to be a highly complex interface
> that would require some really amazing programming magic.
I built a lock server that I do similar things with. You can create
cross-machine locks to mutually exclude operations that need to be
serialized across multiple systems (e.g. I use it for async jobs that
perform search index updates and propagation). It's not meant to be
hugely fast, so I wouldn't do it for every single row, but I haven't
found a need for such a thing yet.
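The lock server itself isn't shown here, but a crude approximation of a
cross-machine lock can be built on memcached's add(); memcached may
evict or lose the key, so this is only best-effort (names and timings
are illustrative):

import time
import uuid

from pymemcache.client.base import Client

client = Client(("127.0.0.1", 11211))  # illustrative address


def acquire(name, ttl=60, timeout=10.0):
    # add() succeeds for exactly one caller while the key exists, which
    # gives a best-effort lock; an eviction or restart silently drops it.
    token = uuid.uuid4().hex.encode()
    deadline = time.time() + timeout
    while time.time() < deadline:
        if client.add("lock:" + name, token, expire=ttl, noreply=False):
            return token
        time.sleep(0.2)
    return None


def release(name, token):
    # Check-and-delete isn't atomic here; a real lock server (or a
    # gets/cas pair) is needed if that race matters.
    if client.get("lock:" + name) == token:
        client.delete("lock:" + name)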
Most of this is just the tired ramblings of someone guessing at
requirements, though. Once there are particular constraints for an
application, the mechanism to ensure correctness becomes clearer.
--
Dustin Sallings