#24218: Use sub-query in ORM when distinct and order_by columns do not match
-------------------------------------+-------------------------------------
     Reporter:  miki725              |      Owner:  nobody
         Type:  New feature          |     Status:  new
    Component:  Database layer       |    Version:  master
  (models, ORM)                      |   Keywords:  subquery distinct
     Severity:  Normal               |  order_by
 Triage Stage:  Unreviewed           |  Has patch:  1
Easy pickings:  0                    |      UI/UX:  0
-------------------------------------+-------------------------------------
 This ticket is to propose a slight change in ORM - use subqueries when
 querying a model where {{{.distinct()}}} and {{{.order_by()}}} (or
 {{{.extra(order_by=())}}}) leftmost columns do not match. For example:

 {{{#!python
 Model.objects.all().distinct('foo').order_by('bar')
 }}}

 The above generates the following SQL:

 {{{#!sql
 SELECT DISTINCT ON ("app_model"."foo") <lots of columns here>
 FROM "app_model"
 ORDER BY "app_model"."bar" ASC;
 }}}

 I am not sure about all backends however the above syntax is not allowed
 in PostgreSQL which produces the following error:

 {{{#!sql
 ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY
 expressions
 }}}

 Here are [http://www.postgresql.org/docs/9.0/static/sql-select.html#SQL-
 DISTINCT PostgreSQL docs] explaining why that is not allowed:

   DISTINCT ON ( expression [, ...] ) keeps only the first row of each set
 of rows where the given expressions evaluate to equal. [...] Note that the
 "first row" of each set is unpredictable unless ORDER BY is used to ensure
 that the desired row appears first. [...] The DISTINCT ON expression(s)
 must match the leftmost ORDER BY expression(s).

 This ticket proposes to use subqueries in such situations which would use
 SQL:

 {{{#!sql
 SELECT *
 FROM (
   SELECT DISTINCT ON ("app_model"."foo") <lots of columns here>
   FROM "app_model"
 ) result
 ORDER BY "app_model"."bar" ASC;
 }}}

 The above is perfectly valid SQL and produces expected results (please
 note that {{{ORDER_BY}}} is in the outer query to guarantee that distinct
 results are correctly sorted).

 I created a simple [https://gist.github.com/miki725/ce26d8b6ee2f3075884a
 patch] by overwriting few things in {{{SQLCompiler.as_sql()}}} which seems
 to work pretty well. The patch only creates subquery when the above dilema
 is encountered which should not have any negative side-effects on existing
 queries (since such queries were not allowed by SQL). The patch also works
 on the {{{.count()}}} queries since Django then strips any ordering hence
 the subquery is never created.

--
Ticket URL: <https://code.djangoproject.com/ticket/24218>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/050.8eba4d92c44158f14b9fe7739eab5013%40djangoproject.com.
For more options, visit https://groups.google.com/d/optout.

Reply via email to