#22438: Slow INNER JOIN in MySQL can be fixed in Django ORM, but should it?
-------------------------------------+-------------------------------------
     Reporter:  frol                 |      Owner:  nobody
         Type:  Uncategorized        |     Status:  new
    Component:  Database layer       |    Version:  master
  (models, ORM)                      |   Keywords:  mysql, orm, slow query,
     Severity:  Normal               |  join
 Triage Stage:  Unreviewed           |  Has patch:  0
Easy pickings:  0                    |      UI/UX:  0
-------------------------------------+-------------------------------------
 There is the bug in MySQL INNER JOIN optimization, but it can be fixed in
 Django ORM by replacing INNER JOIN with STRAIGHT_JOIN. In short, ordered
 inner join selections take 3+ seconds, where if replace INNER JOIN with
 STRAIGHT_JOIN, we get result in 0.001.

 Here are steps to reproduce:
 1) We need a model with ForeignKey:
 {{{
 from django.conf import settings
 from django.db import models

 class Child(models.Model):
     name = models.CharField("Name", max_length=255)
     owner = models.ForeignKey(settings.AUTH_USER_MODEL)
 }}}

 2) Initialize data:
 {{{
 from random import choice
 from django.contrib.auth import get_user_model
 from qq.models import *

 User = get_user_model()
 users = User.objects.all()
 Child.objects.bulk_create(Child(name='child_%d' % i, owner=choice(users))
 for i in xrange(400000))
 }}}

 3) Query data with join (the bug appears only if order by is applied):
 {{{
 Child.objects.all().order_by('-id').select_related('owner')[:2]
 }}}
 The resulting query would be:
 {{{
  {u'sql': u'SELECT `qq_child`.`id`, `qq_child`.`name`,
 `qq_child`.`other_field`, `qq_child`.`owner_id`,
 `auth_user`.`id`, `auth_user`.`password`, `auth_user`.`last_login`,
 `auth_user`.`is_superuser`,
 `auth_user`.`username`, `auth_user`.`first_name`, `auth_user`.`last_name`,
 `auth_user`.`email`,
 `auth_user`.`is_staff`, `auth_user`.`is_active`, `auth_user`.`date_joined`
 FROM `qq_child`
 INNER JOIN `auth_user` ON ( `qq_child`.`owner_id` = `auth_user`.`id` )
 ORDER BY `qq_child`.`id` ASC
 LIMIT 2',
   u'time': u'4.608'},
 }}}

 (MySQL caches result until you update the table so the next time it would
 take 0.001, but not in production if your table updates frequently)

 If I replace INNER JOIN with STRAIGHT_JOIN:
 {{{
 list(Child.objects.raw(
     Child.objects.all().order_by('-id').select_related('owner')[:2]\
         .query.sql_with_params()[0].replace('INNER JOIN', 'STRAIGHT_JOIN')
 ))
 }}}

 I get this result:
 {{{
  {u'sql': u'SELECT `qq_child`.`id`, `qq_child`.`name`,
 `qq_child`.`other_field`, `qq_child`.`owner_id`,
 `auth_user`.`id`, `auth_user`.`password`, `auth_user`.`last_login`,
 `auth_user`.`is_superuser`,
 `auth_user`.`username`, `auth_user`.`first_name`, `auth_user`.`last_name`,
 `auth_user`.`email`,
 `auth_user`.`is_staff`, `auth_user`.`is_active`, `auth_user`.`date_joined`
 FROM `qq_child`
 STRAIGHT_JOIN `auth_user` ON ( `qq_child`.`owner_id` = `auth_user`.`id` )
 ORDER BY `qq_child`.`id` DESC
 LIMIT 2',
   u'time': u'0.001'}]
 }}}

 '''4.6 seconds vs 0.001. Really?'''

 '''Solutions:'''
 1) SQLAlchemy implemented .with_prefix() method for query:
 http://stackoverflow.com/a/16743228
 2) Replace INNER JOIN for MySQL backend with STRAIGHT_JOIN, but it might
 be not a good idea. I will investigate this today.

 '''Wrong attempts to fix this:'''
 Since select_related is used in django admin change_list if you ask for
 related fields in columns, devs try to override it with .raw() -
 http://stackoverflow.com/a/15978732/1178806 (I answered in comments why it
 doesn't work).

-- 
Ticket URL: <https://code.djangoproject.com/ticket/22438>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to django-updates+unsubscr...@googlegroups.com.
To post to this group, send email to django-updates@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/047.fb4a41896834fa58bf28eba502eefc05%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to