Hi guys, I spent literally *months* trying to find the best way to resolve this situation. On our high performance sites + backends (around 120qps, 50/50 split between read and write) we were getting IntegrityError raised by get_or_create (which we now refer to as Object Collisions).
We are able to trigger the issue by having two scripts attempt to get_or_create() a new object at the same time (it takes a few loop iterations until it reproduces itself). When get_or_create() is called, it'll raise a duplicate key exception, but if you then attempt to do a .get() on the row using the same args/kwargs straight away afterwards, you'll notice it returns an empty result. This (we think) is because when you create a row within a transaction, although the row isn't there, it will still raise a duplicate key error until the row is either commited or rolled back. Eventually, we created a modified get_or_create() which *reduce* the chance of it happening: http://pastebin.com/X1nvk0uq However, we are still left with a final problem, which is we either put a sleep() in the code (eww!) to increase the chance of a successful query, or you raise an ObjectNotReady exception (custom), and then add logic into your code to deal with this occurrence. The other option is that you re-factor your code so that the time between commits on large chunks of code are as small as possible, reducing the chance of this error. But, from what we can tell, you can only "reduce" the chance of raising an ObjectNotReady, you can't get rid of it entirely. (unless you get rid of transactions entirely lol) 2011/5/8 Tomasz Zielinski <[email protected]> > W dniu sobota, 7 maja 2011, 20:03:40 UTC+2 użytkownik David Cramer napisał: > >> Would it help (in PG-world) at least if the selects where in >> savepoints as well? >> > > In case you asked me - I don't know, I don't use PG on a daily basis... > > > Anyway, let me list some possible solutions to launch the discussion: > 1. Leave it as is, and update the documentation to warn users that > get_or_create is problematic for concurrent writes unless used in READ > COMMITED isolation level > 2. Switch to my code and update the documentation to warn users > that get_or_create commits the open transaction > 3. Combine 1&2 and somehow allow two modes of operation for get_or_create, > either based on a flag, or based on retrieved isolation level - or maybe add > some get_or_create2 ? > 4. Remove get_or_create, because it cannot be fixed to gracefully handle > all common cases > > Re savepoint-based solution, I don't list it because I don't "feel" how it > should work. > > Tomasz > > -- > You received this message because you are subscribed to the Google Groups > "Django developers" group. > To post to this group, send email to [email protected]. > To unsubscribe from this group, send email to > [email protected]. > For more options, visit this group at > http://groups.google.com/group/django-developers?hl=en. > -- You received this message because you are subscribed to the Google Groups "Django developers" group. To post to this group, send email to [email protected]. To unsubscribe from this group, send email to [email protected]. For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.
