#24218: Use sub-query in ORM when distinct and order_by columns do not match
-------------------------------------+-------------------------------------
Reporter: miki725 | Owner: nobody
Type: New feature | Status: new
Component: Database layer | Version: master
(models, ORM) | Keywords: subquery distinct
Severity: Normal | order_by
Triage Stage: Unreviewed | Has patch: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
This ticket is to propose a slight change in ORM - use subqueries when
querying a model where {{{.distinct()}}} and {{{.order_by()}}} (or
{{{.extra(order_by=())}}}) leftmost columns do not match. For example:
{{{#!python
Model.objects.all().distinct('foo').order_by('bar')
}}}
The above generates the following SQL:
{{{#!sql
SELECT DISTINCT ON ("app_model"."foo") <lots of columns here>
FROM "app_model"
ORDER BY "app_model"."bar" ASC;
}}}
I am not sure about all backends however the above syntax is not allowed
in PostgreSQL which produces the following error:
{{{#!sql
ERROR: SELECT DISTINCT ON expressions must match initial ORDER BY
expressions
}}}
Here are [http://www.postgresql.org/docs/9.0/static/sql-select.html#SQL-
DISTINCT PostgreSQL docs] explaining why that is not allowed:
DISTINCT ON ( expression [, ...] ) keeps only the first row of each set
of rows where the given expressions evaluate to equal. [...] Note that the
"first row" of each set is unpredictable unless ORDER BY is used to ensure
that the desired row appears first. [...] The DISTINCT ON expression(s)
must match the leftmost ORDER BY expression(s).
This ticket proposes to use subqueries in such situations which would use
SQL:
{{{#!sql
SELECT *
FROM (
SELECT DISTINCT ON ("app_model"."foo") <lots of columns here>
FROM "app_model"
) result
ORDER BY "app_model"."bar" ASC;
}}}
The above is perfectly valid SQL and produces expected results (please
note that {{{ORDER_BY}}} is in the outer query to guarantee that distinct
results are correctly sorted).
I created a simple [https://gist.github.com/miki725/ce26d8b6ee2f3075884a
patch] by overwriting few things in {{{SQLCompiler.as_sql()}}} which seems
to work pretty well. The patch only creates subquery when the above dilema
is encountered which should not have any negative side-effects on existing
queries (since such queries were not allowed by SQL). The patch also works
on the {{{.count()}}} queries since Django then strips any ordering hence
the subquery is never created.
--
Ticket URL: <https://code.djangoproject.com/ticket/24218>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
--
You received this message because you are subscribed to the Google Groups
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/django-updates/050.8eba4d92c44158f14b9fe7739eab5013%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.