#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by David Wobrock):
* cc: David Wobrock (added)
Comment:
Hey there,
Took a look at what is happening and why MySQL is failing with
`ONLY_FULL_GROUP_BY`.
In short and simplified, this statement works:
{{{
mysql> SELECT GREATEST(pages, 600), MIN(pages) FROM aggregation_book GROUP
BY GREATEST(pages, 600) ORDER BY NULL;
+----------------------+------------+
| GREATEST(pages, 600) | MIN(pages) |
+----------------------+------------+
| 600 | 300 |
| 1132 | 1132 |
| 946 | 946 |
+----------------------+------------+
3 rows in set (0,01 sec)
}}}
And when you try to add a third expression, that uses the two first:
{{{
mysql> SELECT GREATEST(pages, 600), MIN(pages), LEAST(MIN(pages),
GREATEST(pages, 600)) AS least FROM aggregation_book GROUP BY
GREATEST(pages, 600) ORDER BY NULL;
ERROR 1055 (42000): Expression #3 of SELECT list is not in GROUP BY clause
and contains nonaggregated column
'test_django_tests.aggregation_book2.pages' which is not functionally
dependent on columns in GROUP BY clause; this is incompatible with
sql_mode=only_full_group_by
}}}
MySQL is not happy, even though it seems rather straightforward that this
query should work.
The resources I could find on the topic are here:
- [https://bugs.mysql.com/bug.php?id=90792 A ticket from the MySQL bug
tracker]
- [https://stackoverflow.com/q/49447622 A related StackOverflow thread]
- [https://dev.mysql.com/blog-archive/when-only_full_group_by-wont-see-
the-query-is-deterministic/ A MySQL blog post]
And the blog post explains in more depth why it's not working.
-------
That leaves us with a choice to make for Django's behavior I reckon :)
Some options:
== 1. Add an `ANY_VALUE` around the problematic expression
That solves the issue here for instance:
{{{
mysql> SELECT GREATEST(pages, 600), MIN(pages),
ANY_VALUE(LEAST(MIN(pages), GREATEST(pages, 600))) AS least FROM
aggregation_book2 GROUP BY GREATEST(pages, 600) ORDER BY NULL;
+----------------------+------------+-------+
| GREATEST(pages, 600) | MIN(pages) | least |
+----------------------+------------+-------+
| 600 | 300 | 300 |
| 1132 | 1132 | 1132 |
| 946 | 946 | 946 |
+----------------------+------------+-------+
3 rows in set (0,00 sec)
}}}
However, I fear that detecting when to wrap the expression with an
`ANY_VALUE` is a rabbit hole we don't want to go down, as we might end up
implementing what the MySQL team didn't want to implement.
== 2. Raise awareness
We could, firstly, document the potential issue, and secondly raise a
warning when such an error occurs when executing a query a Django.
That way, users are at least aware that's not entirely their or Django's
fault.
== 3. Generally change query generation
Another type of workaround suggested by the [https://dev.mysql.com/blog-
archive/when-only_full_group_by-wont-see-the-query-is-deterministic/ MySQL
blog post] is to use a subquery/derived table:
{{{
mysql> SELECT greatest_pages,
MIN(pages),
LEAST(MIN(pages), greatest_pages) AS least
FROM (SELECT GREATEST(pages, 600) greatest_pages,
pages
FROM aggregation_book2) AS t
GROUP BY greatest_pages
ORDER BY NULL;
+----------------+------------+-------+
| greatest_pages | MIN(pages) | least |
+----------------+------------+-------+
| 600 | 300 | 300 |
| 1132 | 1132 | 1132 |
| 946 | 946 | 946 |
+----------------+------------+-------+
3 rows in set (0,00 sec)
}}}
So that we always try to group on a column, and not an expression.
Even though, it might be worse in terms of performances, depending the DB
implementation I guess.
This change would then affect all databases I reckon, which is a much
larger change, and therefore riskier.
== 4. Any other option! :D
I hope all of this makes sense, happy to read any thoughts on this :)
See ya!
--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:3>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
--
You received this message because you are subscribed to the Google Groups
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/django-updates/01070185e5c7f547-fb2a25f0-2073-465a-9c97-0ed1c3664e46-000000%40eu-central-1.amazonses.com.