#18557: get_or_create() causes a race condition with MySQL
-------------------------------+--------------------------------------
     Reporter:  foxwhisper     |                    Owner:  nobody
         Type:  Uncategorized  |                   Status:  reopened
    Component:  Core (URLs)    |                  Version:  1.4
     Severity:  Normal         |               Resolution:
     Keywords:                 |             Triage Stage:  Unreviewed
    Has patch:  0              |      Needs documentation:  0
  Needs tests:  0              |  Patch needs improvement:  0
Easy pickings:  0              |                    UI/UX:  0
-------------------------------+--------------------------------------
Changes (by foxwhisper):

 * status:  closed => reopened
 * resolution:  wontfix =>


Comment:

 @akaariai You raise a very good point about this breaking transaction
 control. Enabling such a flag would have to be an explicit opt-in: the
 developer would need to know exactly what it does and when it is safe to
 use.

 I'm thinking something along these lines:

 {{{
 MyModel.objects.get_or_create(a=1, b=2, force_commit=True)
 }}}

 Along with a documentation update that says:

 {{{
 Certain databases (such as MySQL) don't gracefully handle get_or_create()
 when multiple threads are writing to the same table. If throughput is high
 enough, there is a small window for a race condition in which MySQL
 reports that the unique key already exists, but any attempt to fetch the
 row with that key fails.

 The only way around this is to commit the transaction you are in, which
 then allows you to fetch the row. However, if your get_or_create() is in a
 transaction block with manual commits, then any queries issued before the
 get_or_create() call will also be committed.

 If you plan on using this feature, you must ensure that the
 get_or_create() call is within a safe context where it is okay for the
 previous queries to be committed.
 }}}
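
 To make the caveat in that documentation text concrete, here is a rough
 sketch of the side effect, not a patch. It uses SQLite from the Python
 standard library as a stand-in for MySQL, and the tables (ip, import_log)
 and the helper insert_or_fetch() are hypothetical names made up for the
 illustration. The helper mimics only the tail end of get_or_create()
 after the race has been lost: the INSERT hits the unique key, the
 transaction is committed (what force_commit=True would do), and the row is
 fetched again.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ip (id INTEGER PRIMARY KEY, address TEXT UNIQUE)")
conn.execute("CREATE TABLE import_log (note TEXT)")
# Stand-in for the row that a competing worker has already created.
conn.execute("INSERT INTO ip (address) VALUES ('10.0.0.1')")
conn.commit()


def insert_or_fetch(conn, address, force_commit=False):
    """Hypothetical helper: try to insert, fall back to fetching on a duplicate."""
    try:
        cur = conn.execute("INSERT INTO ip (address) VALUES (?)", (address,))
        return cur.lastrowid, True
    except sqlite3.IntegrityError:
        if force_commit:
            # This commits EVERYTHING pending on the connection,
            # not just the work done by this helper.
            conn.commit()
        row = conn.execute(
            "SELECT id FROM ip WHERE address = ?", (address,)
        ).fetchone()
        return row[0], False


# Unrelated work that the caller still expects to be able to roll back.
conn.execute("INSERT INTO import_log (note) VALUES ('half-finished batch')")

ip_id, created = insert_or_fetch(conn, "10.0.0.1", force_commit=True)

# The caller changes its mind, but the forced commit already happened.
conn.rollback()
surviving = conn.execute("SELECT COUNT(*) FROM import_log").fetchone()[0]
print(ip_id, created, surviving)
```

 The import_log row survives the rollback because the forced commit already
 made it permanent. This is the behaviour the documentation note has to
 warn about.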

 Also, to touch on your question about what kind of usage pattern leads to
 this race condition: it's fairly easy to trigger. You just need two or
 more threads calling get_or_create() on the same table at nearly the same
 time. A typical scenario is a queued import job that has to do a
 get_or_create() on a popular item such as an IP address. If two import
 scripts encounter the same IP at the same time, the race condition occurs.
 This affected us so badly because the majority of our work involves
 importing and mangling large data sets, whereas a low-traffic site would
 almost never see it happen.
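
 The interleaving described above can be replayed deterministically with
 two connections. This is only a sketch under assumptions: it uses SQLite
 from the Python standard library rather than MySQL, and the ip table, the
 find() helper and the worker names are hypothetical. Both workers look up
 the same address, both see nothing, and the slower one then fails on the
 unique key.

```python
import os
import sqlite3
import tempfile

# Two connections to one database file stand in for two import workers.
db_path = os.path.join(tempfile.mkdtemp(), "race.db")
worker_a = sqlite3.connect(db_path)
worker_b = sqlite3.connect(db_path)
worker_a.execute("CREATE TABLE ip (id INTEGER PRIMARY KEY, address TEXT UNIQUE)")
worker_a.commit()


def find(conn, address):
    """The "get" half of get-or-create: returns a row tuple or None."""
    return conn.execute(
        "SELECT id FROM ip WHERE address = ?", (address,)
    ).fetchone()


address = "10.0.0.1"

# Step 1: both workers look for the row and neither finds it.
seen_by_a = find(worker_a, address)
seen_by_b = find(worker_b, address)

# Step 2: worker A creates the row first and commits.
worker_a.execute("INSERT INTO ip (address) VALUES (?)", (address,))
worker_a.commit()

# Step 3: worker B, acting on its stale lookup, tries to create it too.
try:
    worker_b.execute("INSERT INTO ip (address) VALUES (?)", (address,))
    worker_b.commit()
    b_lost_race = False
except sqlite3.IntegrityError:
    b_lost_race = True
    worker_b.rollback()

# Step 4: worker B falls back to fetching the row again.
refetched = find(worker_b, address)
print(seen_by_a, seen_by_b, b_lost_race, refetched)
```

 In this SQLite sketch the re-fetch in step 4 finds the row. The problem
 reported in this ticket is that on MySQL that last step fails until the
 surrounding transaction has been committed.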

 If the above suggested patch description sounds good, please let me know
 and I'll get a patch prepared.

-- 
Ticket URL: <https://code.djangoproject.com/ticket/18557#comment:2>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
