It probably doesn't surprise you that NDB's caching is more complicated (and more sensible) than the docs summarise. Specifically, to avoid the problem you highlight, NDB puts a sentinel _LOCKED value into memcache at certain points, signalling to other readers that an update is in progress and that they should not write anything back into the cache. With that in place, the sequence is more like:
1. Instance 1 determines it needs to put changes to an object into the datastore.
2. Instance 2 determines it needs to get the same object from the datastore.
3. Instance 1 *sets the memcache value to _LOCKED*.
4. Instance 2 queries memcache, and finds *_LOCKED*.
5. Instance 2 loads the old version from the datastore.
6. Instance 1 puts the new version in the datastore.
7. *Because it found _LOCKED in memcache, instance 2 doesn't write the value it read from the datastore back into memcache.*
8. *Instance 1 deletes the memcache value.*

NDB also makes use of the 'CAS' (compare-and-set) facility of the memcache service [1][2], which prevents certain re-orderings of this sequence, or failures at the memcache steps, from causing problems.

[1] https://cloud.google.com/appengine/docs/python/memcache/#Python_Using_compare_and_set_in_Python
[2] http://neopythonic.blogspot.co.uk/2011/08/compare-and-set-in-memcache.html

On Friday, 18 March 2016 20:56:05 UTC, Nick (Cloud Platform Support) wrote:
>
> Hey Joel,
>
> Your question fits more on the "general discussion" side of the sometimes
> ambiguous boundary where it might be too in-depth for Stack Overflow, with
> too many possible answers, and yet is also a sort-of-specific issue which
> tends to get redirected to Stack. In the spirit of helping you out
> regardless, I'll be happy to assist here with some advice.
>
> Cache invalidation is traditionally held up as a class of problem in
> computing which is devilishly hard to get right. The cardinal sin is to
> have the system perceive stale data as fresh, although it's acceptable to
> fetch stale data as long as it's identifiable as such and the path to
> fresh data is relatively cheap.
> The race condition you describe is definitely something to avoid, and I
> see one possible solution:
>
> The deletion of the memcache key by instance 1 is clearly meant to
> prepare the way for a put() to the datastore of the new entity, so we
> should make the composition of these two actions into one atomic action,
> so that instance 2 will fail to fetch from memcache and thereafter find
> the new value in the datastore. This can be accomplished as follows:
>
> Rather than deleting the memcache key before the put, a timestamp can be
> put in its place (or somehow associated with it), only to be deleted once
> the datastore entity is safely put. This means that instance 2,
> attempting to access the memcache key, would notice that it's currently
> being updated and do extremely small incremental sleeps until it finds
> the new version available in the datastore.
>
> Could you go into a little more detail about your system specifically? I
> feel this will help spur more concrete discussion of the trade-offs you
> can make in your specific situation.
>
> Regards,
>
> Nick
> Cloud Platform Community Support
>
>
> On Thursday, March 17, 2016 at 1:27:46 PM UTC-4, Joel Holveck wrote:
>>
>> By the way, feel free to say this should be on Stack Overflow if that's
>> more appropriate. I still don't have a good feel for what should be
>> posted here vs. elsewhere.
>>
