Perhaps I need to give some specific examples.
Example 1: we need to display some data to the user, but generating it is slow, so we try to cache it:

    try:
        cache = Cache.objects.get(start=?, stop=?)
    except Cache.DoesNotExist:
        data = get_data(start=?, stop=?)
        cache = Cache.objects.create(start=?, stop=?,
                                     xxx=data.xxx, yyy=data.yyy, ...)
    [ render response using cache ]

So the first step I can take is to make sure (start, stop) is uniquely indexed. That way, if this runs concurrently, the other processes will fail rather than create multiple objects, which would result in every subsequent request failing. Still not very good from the user's perspective. Ideally, as get_data is a db-intensive operation, I only want to call it once for a given start/stop; otherwise we use more resources than required. I also risk being vulnerable to DoS attacks if I get a lot of requests at the same time (you could argue this is a problem anyway, as start and stop come from the user).

I think I could change that to something like this (if I understand celery correctly):

    from app.tasks import get_data

    try:
        cache = Cache.objects.get(start=?, stop=?)
    except Cache.DoesNotExist:
        cache = Cache.objects.create(start=?, stop=?)
        cache.task = get_data.delay()
        cache.save()
        # cache.xxx and cache.yyy to be filled in by the celery task

    if cache.task is not None and not cache.task.ready():
        [ render processing message ]
    else:
        [ render response using cache ]

However, unfortunately, I still have the same race condition.

Example 2: I have a photo database that accepts imports from JavaScript. The JavaScript sends a POST request for every file to be uploaded, with the randomly generated name of the album to upload the photo to. As its first step it does:

    Album.objects.get_or_create(name=?)

There is an issue I haven't investigated yet where, on the first upload, the JavaScript uploads the first two files concurrently, despite the fact that I configured it to only allow one at a time. Regardless, being able to support concurrent uploads is probably a desirable feature.
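For Example 1, the usual way to close the remaining gap is to pair the unique index with a catch-and-refetch: if the create fails with an integrity error, some other process won the race, so just fetch the row it made (this is essentially what Django's get_or_create does internally). Here is a minimal standalone sketch of that pattern using plain sqlite3 so it can actually be run; the table layout and the "expensive-result" placeholder for get_data() are made up for illustration:

```python
import sqlite3

# Hypothetical stand-in for the Cache model: a table with a UNIQUE
# constraint on (start, stop), so concurrent creates cannot duplicate rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cache ("
    " start TEXT, stop TEXT, data TEXT,"
    " UNIQUE (start, stop))"
)

def get_or_compute(start, stop):
    """Fetch the cached row; on a miss, try to create it, and if another
    process inserted first (UNIQUE violation), fall back to re-fetching."""
    row = conn.execute(
        "SELECT data FROM cache WHERE start = ? AND stop = ?",
        (start, stop)).fetchone()
    if row:
        return row[0]
    data = "expensive-result"  # placeholder for the slow get_data() call
    try:
        conn.execute(
            "INSERT INTO cache (start, stop, data) VALUES (?, ?, ?)",
            (start, stop, data))
        return data
    except sqlite3.IntegrityError:
        # A concurrent request created the row first; its copy wins.
        row = conn.execute(
            "SELECT data FROM cache WHERE start = ? AND stop = ?",
            (start, stop)).fetchone()
        return row[0]

print(get_or_compute("2024-01-01", "2024-01-31"))
print(get_or_compute("2024-01-01", "2024-01-31"))  # second call hits the cache
```

In Django terms this corresponds to wrapping Cache.objects.create() in try/except IntegrityError and retrying the .get(). Note the expensive computation can still run twice in the worst case; the guarantee is only that the database never ends up with duplicate rows.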
I can't create a unique index on name here, as I don't consider it an error to have two albums with the same name. At the same time, I don't want uploads to randomly fail either. I am thinking the solution is to make sure the album is created before the first upload, and maybe even to reference it in the POST request by id rather than name.

Example 3: creating a new user. A user puts in a request for an account, and an administrator has to approve the request. If two administrators approve the same request at the same time, we could end up with two accounts for the same user. Oops. Or with an error, if some unique index catches, say, the duplicate username or email address.

I guess I really need to think about minimising the risks, as opposed to totally exterminating all possible race conditions: focus on preserving database integrity and ensuring that any possible damage (e.g. duplicate records) is minimised.
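For Example 3, one way to make the approval step safe without explicit locking is a guarded UPDATE: only flip the request's status if it is still pending, and let the affected row count tell you whether you won the race. A minimal standalone sketch with sqlite3 (the signup_request table and its columns are hypothetical):

```python
import sqlite3

# Hypothetical signup_request table; names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE signup_request ("
    " id INTEGER PRIMARY KEY, username TEXT, status TEXT)"
)
conn.execute(
    "INSERT INTO signup_request (username, status) VALUES ('alice', 'pending')"
)
conn.commit()

def approve(request_id):
    """Atomically flip the request from 'pending' to 'approved'.

    Only the first approver's UPDATE matches a row; the second sees
    rowcount == 0, so the account is created exactly once."""
    cur = conn.execute(
        "UPDATE signup_request SET status = 'approved' "
        "WHERE id = ? AND status = 'pending'",
        (request_id,))
    conn.commit()
    if cur.rowcount == 1:
        # We won the race: safe to create the account here.
        return True
    return False  # another administrator already approved it

print(approve(1))  # True  - first administrator wins
print(approve(1))  # False - second approval is a no-op
```

In Django this maps to checking the row count returned by SignupRequest.objects.filter(id=..., status='pending').update(status='approved'); select_for_update() inside a transaction is the other common option, at the cost of holding a row lock.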
_______________________________________________ melbourne-pug mailing list [email protected] https://mail.python.org/mailman/listinfo/melbourne-pug
