Example 3 can be fixed by making administrator approval idempotent. The user puts in a request for an account, and in doing so creates a unique "I want an account" token; the ID of this token could be a hash of the associated email, perhaps. Then it doesn't matter if two administrators authorize it in sequence: the second authorization will leave the already-authorized token untouched.
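Roughly what I have in mind, as a Django-flavoured sketch (this assumes it lives in a Django app; SignupRequest, token_for() and approve() are made-up names, not anything from your code):

import hashlib

from django.db import models


class SignupRequest(models.Model):
    # The "I want an account" token: the primary key is derived from the
    # email, so the same request can only ever exist once.
    token = models.CharField(max_length=64, primary_key=True)
    email = models.EmailField()
    approved = models.BooleanField(default=False)

    @staticmethod
    def token_for(email):
        return hashlib.sha256(email.lower().encode("utf-8")).hexdigest()


def approve(email):
    # A single UPDATE ... WHERE approved = false. The first approval flips
    # the flag; a second (or concurrent) approval matches zero rows and
    # does nothing, so it cannot produce a duplicate account.
    flipped = SignupRequest.objects.filter(
        token=SignupRequest.token_for(email), approved=False
    ).update(approved=True)
    if flipped:
        pass  # only here would the real user account actually get created

The point is just that approval mutates the token rather than creating anything directly, so repeating it is harmless.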
I'm not so sure about example 1, but perhaps it could be done in a similar way. Create a cached object that has some kind of unique key, and always create it empty, with a flag that says "busy being born". That way the race can be minimised, as this creation takes very little time (though not eliminated, as it's not an atomic operation). Requests that find the cache object while it's still being populated can perform a wait(), or maybe register themselves to be notified via callback when the object is finished. Something like the sketch below.
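For example 1 it might look something like this (again Django-ish and hand-wavy: CacheEntry and its fields are invented, get_data stands in for the expensive query from your example, and the polling loop is just the dumbest possible wait()). Using get_or_create against a unique index keeps the placeholder creation about as close to atomic as the ORM allows:

import time

from django.db import models


class CacheEntry(models.Model):
    start = models.DateTimeField()
    stop = models.DateTimeField()
    ready = models.BooleanField(default=False)   # False while "busy being born"
    data = models.TextField(blank=True)

    class Meta:
        unique_together = [("start", "stop")]    # one placeholder per start/stop


def fetch(start, stop, timeout=30):
    # get_or_create against the unique key means at most one request wins the
    # race to create the placeholder; everyone else sees created == False.
    entry, created = CacheEntry.objects.get_or_create(start=start, stop=stop)
    if created:
        entry.data = get_data(start=start, stop=stop)  # the expensive call, done once
        entry.ready = True
        entry.save()
        return entry

    # Losers wait for the winner to finish populating the row. A callback or
    # notification would do the same job with less polling.
    deadline = time.time() + timeout
    while not entry.ready and time.time() < deadline:
        time.sleep(0.5)
        entry = CacheEntry.objects.get(pk=entry.pk)
    return entry

The obvious wart is that if the process that created the placeholder dies before filling it in, everyone else waits until the timeout, so stale placeholders would need to be expired somehow.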
J

On Fri, Oct 18, 2013 at 12:05 PM, Brian May <[email protected]> wrote:
> Perhaps I need to give some specific examples.
>
>
> Example 1: we need to display the user some data, however this is slow, so
> we try to cache it:
>
> try:
>     cache = Cache.objects.get(start=?, stop=?)
> except Model.DoesNotExist:
>     data = get_data(start=?, stop=?)
>     cache = Cache.objects.create(start=?, stop=?, xxx=data.xxx, yyy=data.yyy,
>                                  ...)
> [ render response using cache ]
>
> So the first step I can do is make sure start and stop are uniquely indexed.
> That way if it is run concurrently, the other processes will fail rather
> than create multiple objects resulting in every request failing. Still not
> very good from the user's perspective.
>
> Ideally, as get_data is a db intensive operation, I only want to call it once
> for a given start/stop. Otherwise we use more resources than required. Also
> I risk being vulnerable to DOS attacks if I get a lot of requests at the
> same time (you could argue this is a problem anyway as the start and stop
> come from the user).
>
> I think I could change that to something like (if I understand celery
> correctly):
>
> from app.tasks import get_data
>
> try:
>     cache = Cache.objects.get(start=?, stop=?)
> except Model.DoesNotExist:
>     cache = Cache.objects.create(start=?, stop=?)
>     cache.task = get_data.delay()
>     cache.save()
>     # cache.xxx and cache.yyy to be filled in by celery task
>
> if cache.task is not None and not cache.task.ready():
>     [ render processing message ]
> else:
>     [ render response using cache ]
>
> However, unfortunately, I still have the same race condition.
>
>
> Example 2: I have a photo database that accepts imports from JavaScript. The
> JavaScript will send a POST request for every file to be uploaded, with the
> randomly generated name of the album to upload the photo to. At the first
> step it does:
>
> Album.objects.get_or_create(name=?)
>
> There is an issue I haven't investigated yet with the JavaScript: for the
> first upload it will upload the first two files concurrently, despite the
> fact I configured it to only allow one at a time. Regardless, being able
> to support concurrent uploads is probably a desirable feature.
>
> I can't create a unique index here on name; I don't consider it an error to
> have two albums with the same name.
>
> Regardless, I don't want uploads to randomly fail either.
>
> I am thinking the solution here is that I need to make sure that the album is
> created before the first upload, and maybe even reference it in the POST
> request by id rather than name.
>
>
> Example 3: Creating a new user. User puts in a request for an account.
> Administrator has to approve the request. If two administrators approve the
> same request at the same time, we could end up with two accounts for the
> same user. Ooops. Or an error, if some unique index caught, say, the
> duplicate username or email address.
>
> I guess I really need to think about minimising the risks, as opposed to
> total extermination of all possible race conditions. Instead focus on
> ensuring database integrity and that possible damage (e.g. duplicate
> records) is minimised.

_______________________________________________
melbourne-pug mailing list
[email protected]
https://mail.python.org/mailman/listinfo/melbourne-pug
