#16554: Unnecessary join when using a reverse foreign-key relationship in 
separate
filter or aggregate calls
------------------------+----------------------------------------------
 Reporter:  bendavis78  |          Owner:  nobody
     Type:  Bug         |         Status:  new
Milestone:              |      Component:  Database layer (models, ORM)
  Version:  SVN         |       Severity:  Normal
 Keywords:              |   Triage Stage:  Unreviewed
Has patch:  0           |  Easy pickings:  0
    UI/UX:  0           |
------------------------+----------------------------------------------
 Django allows you to perform queries across reverse foreign key
 relationships.  If, however, you need to access that same relationship in
 more than one filter or aggregate call, the ORM creates unnecessary joins.

 == Example ==

 app/models.py:
 {{{#!python
 class Author(models.Model):
     first_name = models.CharField(max_length=30)
     last_name = models.CharField(max_length=30)
     email = models.EmailField()

 class Book(models.Model):
     author = models.ForeignKey(Author, related_name='books')
     title = models.CharField(max_length=100)
     genre = models.CharField(max_length=20, choices=(
         ('SCIFI', 'SciFi'),
         ('FANTASY', 'Fantasy'),
         ('NONFICTION', 'NonFiction')
     ))
     published = models.DateField()
     pages = models.IntegerField()
 }}}

 This particular query will only perform one join on app_books:
 {{{#!python
 qs = Author.objects.filter(books__genre='SCIFI', books__pages__gt=500)
 print qs.query
 }}}
 {{{#!sql
 SELECT "app_author"."id", "app_author"."first_name",
 "app_author"."last_name", "app_author"."email"
 FROM "app_author"
 INNER JOIN "app_book" ON ("app_author"."id" = "app_book"."author_id")
 WHERE ("app_book"."genre" = 'SCIFI'  AND "app_book"."pages" > 500 )
 }}}

 However, if you separate the filter() call into two separate calls, you
 get this:
 {{{#!python
 qs = Author.objects.filter(books__genre='SCIFI')
 qs = qs.filter(books__pages__gt=500)
 print qs.query
 }}}
 {{{#!sql
 SELECT "app_author"."id", "app_author"."first_name",
 "app_author"."last_name", "app_author"."email"
 FROM "app_author"
 INNER JOIN "app_book" ON ("app_author"."id" = "app_book"."author_id")
 INNER JOIN "app_book" T3 ON ("app_author"."id" = T3."author_id")
 WHERE ("app_book"."genre" = 'SCIFI'  AND T3."pages" > 500 )
 }}}
 As you can see, simply separating out the filters into separate calls
 creates a new, completely unecessary join.

 This also occurs if you need to do an aggregate:
 {{{#!python
 qs = Author.objects.annotate(pages_written=Sum('books__pages'))
 qs = qs.filter(books__genre='SCIFI')
 print qs.query
 }}}
 {{{#!sql
 SELECT "app_author"."id", "app_author"."first_name",
 "app_author"."last_name", "app_author"."email", SUM("app_book"."pages") AS
 "pages_written"
 FROM "app_author"
 LEFT OUTER JOIN "app_book" ON ("app_author"."id" = "app_book"."author_id")
 INNER JOIN "app_book" T3 ON ("app_author"."id" = T3."author_id")
 WHERE T3."genre" = 'SCIFI'
 GROUP BY "app_author"."id", "app_author"."first_name",
 "app_author"."last_name", "app_author"."email"
 }}}
 In this particular case, I imagine the ORM is seeing the need for two
 joins because one is outer and one is inner, however, I would argue that
 if an outer join already occurs as a result of an aggregate, it should
 always use the existing join.

-- 
Ticket URL: <https://code.djangoproject.com/ticket/16554>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/django-updates?hl=en.

Reply via email to