On 18/10/2013 12:05 PM, Brian May wrote:
Perhaps I need to give some specific examples.


Example 1: we need to display some data to the user; however, generating it is slow, so we try to cache it:

try:
   cache = Cache.objects.get(start=?, stop=?)
except Cache.DoesNotExist:
   data = get_data(start=?, stop=?)
   cache = Cache.objects.create(start=?, stop=?, xxx=data.xxx, yyy=data.yyy, ...)
[ render response using cache ]
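To see why this pattern needs a guard at all, here is a minimal, Django-free sketch of the race: two workers both check the cache before either has stored a result, so the expensive get_data() runs twice. The names (get_data, cache, worker) are illustrative stand-ins for the ORM calls above, and the barrier is only there to make the race deterministic for demonstration.

```python
import threading

cache = {}   # stands in for the Cache table
calls = []   # records each run of the slow path

def get_data(start, stop):
    calls.append((start, stop))   # count how often the expensive work runs
    return {"xxx": start + stop}

barrier = threading.Barrier(2)    # force both workers past the lookup together

def worker(start, stop):
    key = (start, stop)
    if key not in cache:          # both threads see a miss...
        barrier.wait()            # ...before either has stored anything
        cache[key] = get_data(start, stop)

t1 = threading.Thread(target=worker, args=(1, 2))
t2 = threading.Thread(target=worker, args=(1, 2))
t1.start(); t2.start(); t1.join(); t2.join()

print(len(calls))  # 2 -- the same data was generated twice
```

In a real web app the "barrier" is just two requests arriving close together while generation is slow.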



I would set up a table (guard_get_data, say) with a unique index on (start, stop) and a timestamp field.

try:
   cache = Cache.objects.get(start=?, stop=?)
except Cache.DoesNotExist:
   try:
      insert into guard_get_data (start, stop, now)    # not real code
      try:
         data = get_data(start=?, stop=?)
         cache = Cache.objects.create(start=?, stop=?, xxx=data.xxx, yyy=data.yyy, ...)
      finally:
         delete (start, stop) from guard_get_data  # not real code
   except insert error:
      # already being generated
      wait for guard_get_data record on (start, stop) to be deleted  # not real code
      # the data should be in the cache now
      cache = Cache.objects.get(start=?, stop=?)

Having the timestamp field means you can check that the guard_get_data record is not too old. If it is, it probably means that Python crashed without deleting the record, or that the generation process is spinning in another thread/process.
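That staleness check can be sketched as a small helper; the max_age bound here is an assumption you would tune to how long get_data() could plausibly take:

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(minutes=10)   # illustrative bound, not from the original post

def guard_is_stale(created_at, now=None, max_age=MAX_AGE):
    """True if the guard row is old enough to assume its creator died."""
    now = now or datetime.now(timezone.utc)
    return now - created_at > max_age
```

If the row is stale, the waiting caller can delete it and retry the insert itself rather than waiting forever.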

The insert into guard_get_data is atomic: it will only ever succeed for one caller, so it eliminates the race condition.
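Here is a minimal sketch of that "atomic insert as lock" idea, using sqlite3 from the standard library rather than Django's ORM: a UNIQUE constraint on (start, stop) guarantees that exactly one caller's INSERT succeeds, and the loser gets an IntegrityError telling it someone else is already generating the data.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE guard_get_data (
        start INTEGER, stop INTEGER, created REAL,
        UNIQUE (start, stop)
    )
""")

def try_acquire(conn, start, stop):
    """Return True if we won the right to call get_data()."""
    try:
        with conn:   # commit on success, roll back on error
            conn.execute(
                "INSERT INTO guard_get_data VALUES (?, ?, ?)",
                (start, stop, time.time()))
        return True
    except sqlite3.IntegrityError:
        return False   # another caller already holds the guard row

print(try_acquire(conn, 1, 2))  # True  -- first caller wins
print(try_acquire(conn, 1, 2))  # False -- second caller backs off
```

Any database that enforces unique constraints gives you the same guarantee; sqlite3 is used here only so the sketch is self-contained.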

I've not used Django's ORM in ages, so I don't know what method it uses for atomic inserts, or whether it would be better to drop down to the SQL level.

Cheers,

Rasjid.

_______________________________________________
melbourne-pug mailing list
[email protected]
https://mail.python.org/mailman/listinfo/melbourne-pug