Great references. I had not seen them before and they explain a lot. I was puzzled by why _LOCK_TIME was so long at 32 seconds, but now I know: it caters for the maximum datastore retry length of 30 seconds, plus a little extra.
It looks like Guido had a difficult time of it before memcache compare-and-swap was available in the Python runtime. Thanks for the tips.

On Monday, November 11, 2013 6:20:24 PM UTC, Alex Burgel wrote:

> Thanks for writing this up. I had been trying to figure this out myself.
>
> I think your reasoning is correct. There is also the case of one client deleting an entity while another client causes it to be added back to memcache. _LOCKED should help in that case too. I think the key to all this is the timeout duration, because if you have one very slow client, it could still put old data into the cache.
>
> I came across some ndb issues that talk about some of these problems:
>
> https://code.google.com/p/appengine-ndb-experiment/issues/detail?id=17
> https://code.google.com/p/appengine-ndb-experiment/issues/detail?id=84
>
> Also, I found this Facebook post on how they replaced memcache. It also discusses similar issues:
>
> https://www.facebook.com/note.php?note_id=10151347090423920
>
> --Alex
>
> On Monday, November 11, 2013 10:25:24 AM UTC-5, Dan wrote:
>
>> Having thought about this a bit, I think I understand why _LOCKED <https://code.google.com/p/appengine-ndb-experiment/source/browse/ndb/context.py#27> needs to be used with NDB memcache to keep the datastore and memcache in sync within a transaction. FYI, NDB checks local memory, then memcache, and then the datastore for an entity.
>>
>> Suppose I implemented the naive approach mentioned above: I clear out all memcached entities affected at the start of a transaction, and after the transaction succeeds I repopulate memcache with the updated entities. This method *would* give stale data if memcache fails to repopulate at the end of the transaction.
>>
>> For example, say I have an entity MyEntity{int_property: 0} and I want to increment int_property by 1 transactionally:
>>
>> 1. Starting point: MyEntity{int_property: 0} is in memcache and the datastore.
>> 2.
>> Transaction begins.
>> 3. Delete MyEntity{int_property: 0} from memcache.
>> 4. Get MyEntity{int_property: 0} from the datastore.
>> 5. Put the entity, changing MyEntity{int_property: 0} to MyEntity{int_property: 1}.
>> 6. Transaction succeeds.
>> 7. Place MyEntity{int_property: 1} into memcache.
>>
>> What happens if, between the start and end of the transaction, an external Get request repopulates memcache with MyEntity{int_property: 0}? That's fine, because step 7 will overwrite that memcache entity when the transaction succeeds. However, what if step 7 *fails*? Everyone will be reading stale values (MyEntity{int_property: 0}) from memcache despite the transaction succeeding.
>>
>> I imagine _LOCKED is used to prevent this from happening:
>>
>> 1. Starting point: MyEntity{int_property: 0} is in memcache and the datastore.
>> 2. Transaction begins.
>> 3. Lock the MyEntity key in memcache.
>> 4. Get MyEntity{int_property: 0} from the datastore.
>> 5. Put the entity, changing MyEntity{int_property: 0} to MyEntity{int_property: 1}.
>> 6. Transaction succeeds.
>> 7. Unlock the MyEntity key in memcache and place MyEntity{int_property: 1} into it.
>>
>> With this method, a Get request external to the transaction will go straight to the datastore for its entity, as it will see that memcache is locked. The important bit is that if the transaction succeeds in step 6 but the memcache Set fails in step 7, we have no data-consistency problem: all Get requests will still see the memcache lock and use the underlying datastore. The unfortunate side effect is that memcache is out of service for that entity; however, I see a _LOCK_TIME <https://code.google.com/p/appengine-ndb-experiment/source/browse/ndb/context.py#26> variable which will expire the lock after a reasonable period in order to put memcache back in action.
>>
>> I know I could probably just use pdb and step through NDB, but it is more fun to figure it out for myself.
>> Anyone know if I am on the right track with this?
>>
>> My motivation is that the Go SDK saves me money on instances but loses me money on datastore access. There are several Go libraries around, but none of them seem to be as rigorous or useful as NDB. After having tried several of them, I am back to using "appengine/datastore" and slowly hand-crafting each datastore hotspot, which is painful and error-prone.
>>
>> On Wednesday, November 6, 2013 8:09:31 PM UTC, Dan wrote:
>>
>>> Would someone be able to explain to me the strategy that NDB uses to keep its memory, memcache, and datastore entities in sync and consistent (especially during transactions)?
>>>
>>> I can't quite figure out from the code <https://code.google.com/p/appengine-ndb-experiment/> what goes on.
>>>
>>> For example, before the start of a transaction, does NDB delete memory and memcache entities that will be affected by the transaction and then repopulate them if the transaction succeeds?
>>>
>>> I see reference to a _LOCKED <https://code.google.com/p/appengine-ndb-experiment/source/browse/ndb/context.py#27> value used to lock memcache. What is this used for, and what happens if the unlock operation fails?
>>>
>>> Best wishes,
>>> Dan
