#24385: Sum() doesn't seem to respect .distinct() in the queryset filter before
.aggregate() when filtering by m2m.
-------------------------------------+-------------------------------------
Reporter: mcagl | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: 1.7
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by mcagl):
It seems similar, or at least somewhat related.
I did a little more inspecting, doing this in {{{./manage.py shell}}} to
obtain the executed raw SQL statements:
{{{
import logging
l = logging.getLogger('django.db.backends')
l.setLevel(logging.DEBUG)
l.addHandler(logging.StreamHandler())
import datetime
from decimal import Decimal
import my_app.models as app_models
t0 = app_models.Tag.objects.create(name="Tag0")
t1 = app_models.Tag.objects.create(name="Tag1")
t2 = app_models.Tag.objects.create(name="Tag2")
t3 = app_models.Tag.objects.create(name="Tag3")
app_models.Row.objects.create(day=datetime.date(2014, 1, 1),
amount=Decimal('10.00'))
r = app_models.Row.objects.create(day=datetime.date(2014, 1, 2),
amount=Decimal('10.00'))
r.tags.add(t0)
r = app_models.Row.objects.create(day=datetime.date(2014, 1, 3),
amount=Decimal('10.00'))
r.tags.add(t1)
r = app_models.Row.objects.create(day=datetime.date(2014, 1, 4),
amount=Decimal('10.00'))
r.tags.add(t0, t1)
r = app_models.Row.objects.create(day=datetime.date(2014, 1, 5),
amount=Decimal('10.00'))
r.tags.add(t2, t3)
}}}
so mainly adding a DEBUG level logger for SQL queries, and recreating the
objects in the setUp() method of the TestCase in the github repository.
Then I tried the queries, obtaining this output, first without
{{{distinct()}}}:
{{{
rows = app_models.Row.objects.filter(tags__in=[t0, t1]
list(rows)
}}}
{{{
>>> (0.000) QUERY = u'SELECT "my_app_row"."id", "my_app_row"."amount",
"my_app_row"."day" FROM "my_app_row" INNER JOIN "my_app_row_tags" ON (
"my_app_row"."id" = "my_app_row_tags"."row_id" ) WHERE
"my_app_row_tags"."tag_id" IN (%s, %s)' - PARAMS = (1, 2); args=(1, 2)
>>> [<Row: Row object>, <Row: Row object>, <Row: Row object>, <Row: Row
object>]
}}}
{{{
agg = rows.aggregate(Sum('amount'))
print agg['amount__sum']
}}}
{{{
>>> (0.000) QUERY = u'SELECT SUM("my_app_row"."amount") AS "amount__sum"
FROM "my_app_row" INNER JOIN "my_app_row_tags" ON ( "my_app_row"."id" =
"my_app_row_tags"."row_id" ) WHERE "my_app_row_tags"."tag_id" IN (%s, %s)'
- PARAMS = (1, 2); args=(1, 2)
>>> 40.00
}}}
Then I added {{{distinct()}}}, which gets added as expected (and the rows
queryset gets 3 objects instead of 4, also expected):
{{{
rows = app_models.Row.objects.filter(tags__in=[t0, t1]).distinct()
list(rows)
}}}
{{{
>>> (0.000) QUERY = u'SELECT DISTINCT "my_app_row"."id",
"my_app_row"."amount", "my_app_row"."day" FROM "my_app_row" INNER JOIN
"my_app_row_tags" ON ( "my_app_row"."id" = "my_app_row_tags"."row_id" )
WHERE "my_app_row_tags"."tag_id" IN (%s, %s)' - PARAMS = (1, 2); args=(1,
2)
>>> [<Row: Row object>, <Row: Row object>, <Row: Row object>]
}}}
but somewhat fails when doint aggregation in this version of the queryset,
because the {{{DISTINCT}}} is added before {{{SUM}}}, whereas I expect
that {{{distinct()}}} gets executed before doing the aggregation, as it's
the python code I wrote:
{{{
agg = rows.aggregate(Sum('amount'))
print agg['amount__sum']
}}}
{{{
>>> (0.000) QUERY = u'SELECT DISTINCT SUM("my_app_row"."amount") AS
"amount__sum" FROM "my_app_row" INNER JOIN "my_app_row_tags" ON (
"my_app_row"."id" = "my_app_row_tags"."row_id" ) WHERE
"my_app_row_tags"."tag_id" IN (%s, %s)' - PARAMS = (1, 2); args=(1, 2)
>>> 40.00
}}}
And obviously {{{agg['amount__sum']}}} contains an unexpected value,
because at an higher level it's like the Row object with two tags is
counted twice, completely disregarding the {{{distinct()}}} method I
called before aggregating.
Thanks for the quick response, I hope that this can be useful,
Mark
--
Ticket URL: <https://code.djangoproject.com/ticket/24385#comment:3>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
--
You received this message because you are subscribed to the Google Groups
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/django-updates/063.b522644ff0cac109cdf89b40581b1ae1%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.