#31005: Incorrect GROUP BY aggregate values when filtering by annotation on
reverse-related LEFT OUTER JOIN
-------------------------------------+-------------------------------------
               Reporter:  Oroku      |          Owner:  nobody
  Saki                               |
                   Type:             |         Status:  new
  Uncategorized                      |
              Component:  Database   |        Version:  1.11
  layer (models, ORM)                |
               Severity:  Normal     |       Keywords:
           Triage Stage:             |      Has patch:  0
  Unreviewed                         |
    Needs documentation:  0          |    Needs tests:  0
Patch needs improvement:  0          |  Easy pickings:  0
                  UI/UX:  0          |
-------------------------------------+-------------------------------------
 Given the following models:

 {{{
 #!python
 from django.db import models


 class Account(models.Model):
     name = models.CharField(max_length=255)


 class Ticket(models.Model):
     account = models.ForeignKey(Account)


 class StatusChange(Model):
     ticket = models.ForeignKey(Ticket)
     created_on = models.DateField(auto_now_add=True)
 }}}

 Create some objects:

 {{{
 #!python
 a = Account.objects.create(name='One')
 t = a.ticket_set.create()
 t.statuschange_set.create()
 t.statuschange_set.create()
 a = Account.objects.create(name='Two')
 t = a.ticket_set.create()
 t.statuschange_set.create()
 t.statuschange_set.create()
 t.statuschange_set.create()
 }}}

 To summarize, you've created 2 Accounts, each with 1 Ticket, and each
 Ticket with 2 and 3 StatusChange records per Ticket, respectively.

 Now, query for some reporting info:

 {{{
 #!python
 from datetime import timedelta

 from django.utils import timezone
 from django.db.models import Max, Count


 # Make a Ticket queryset annotated with the last update to a related
 StatusChange
 qs = Ticket.objects.annotate(
     updated_at=Max(
         'statuschange__created_on',
         output_field=models.DateField()
     )
 )

 # Filter by updated_at (just filter by < now to make sure no records are
 filtered out)
 qs = qs.filter(updated_at__lte=timezone.now())

 # Now, count the rows, grouped by Account name
 qs.order_by('account__name').values('account__name').annotate(Count('pk'))
 }}}

 **The expected output is:**

 {{{
 #!python
 <QuerySet [{'account__name': 'One', 'pk__count': 1}, {'account__name':
 'Two', 'pk__count': 1},]>
 }}}

 Since we know that there is just 1 Ticket per Account.

 **Instead, I'm seeing:**

 {{{
 #!python
 <QuerySet [{'account__name': 'One', 'pk__count': 2}, {'account__name':
 'Two', 'pk__count': 3},]>
 }}}

 Because the {{{LEFT OUTER JOIN}}} in the query is causing the second
 {{{Count}}} annotation to count once per outer-related row.

-- 
Ticket URL: <https://code.djangoproject.com/ticket/31005>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/052.64da59bd979585a157f7d155e5ebe9d3%40djangoproject.com.

Reply via email to