I'm not sure how much QuerySet optimization django should really be doing.
Databases' optimizers are really good and as we've seen there are a huge
number of edge cases in trying to do the work in Djangoland, as QuerySets
don't really correspond to the underlying data model.
Here's an example in MySQL 5.7 where a subselect with an expression that
looks like a django annotation is being optimized away:
adam@localhost [7]> show create table t;
+-------+---------------------------------------------------
-----------------------------+
| Table | Create Table
|
+-------+---------------------------------------------------
-----------------------------+
| t | CREATE TABLE `t` (
`a` text,
`b` text
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |
+-------+---------------------------------------------------
-----------------------------+
1 row in set (0.00 sec)
adam@localhost [8]> explain select count(*) from (select char_length(a)
from t) tt;
+----+-------------+-------+------------+------+------------
---+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key |
key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------------+------+------------
---+------+---------+------+------+----------+-------+
| 1 | SIMPLE | t | NULL | ALL | NULL | NULL |
NULL | NULL | 2 | 100.00 | NULL |
+----+-------------+-------+------------+------+------------
---+------+---------+------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
adam@localhost [9]> show warnings;
+-------+------+--------------------------------------------
------------------+
| Level | Code | Message
|
+-------+------+--------------------------------------------
------------------+
| Note | 1003 | /* select#1 */ select count(0) AS `count(*)` from
`test`.`t` |
+-------+------+--------------------------------------------
------------------+
1 row in set (0.00 sec)
The last note is the rewritten SQL - as you can see it figured the
char_length(a) and subselect were unnecessary and removed them. Most
databases will have these kind of optimizations in place.
Tom, have you got the SQL of problematic counts you've experienced in the
past? It might be easier to fix this at a different level, e.g. the SQL
that Django generates, that try adding an optimization "layer".
On 21 August 2017 at 12:14, Anssi Kääriäinen <[email protected]> wrote:
> On Monday, August 21, 2017 at 11:28:35 AM UTC+3, Tom Forbes wrote:
>>
>> > Interestingly enough, just doing a .filter(m2m_relation__foo='bar')
>> might change the results
>>
>> Is that not because it's a filter? Would ".
>> annotate(x=F('m2m_relation__foo'))"
>> change the results in any way? Sorry if I'm not following.
>>
>
> What's happening here is that Django generates an unconstrained join to
> the m2m_relation. Unsurprisingly that generates duplicate rows to the
> queryset. For example:
>
> class Publisher:
> name = CharField()
>
> class Book:
> publisher = ForeignKey(Publisher, related_name='books')
>
> print(Publisher.objects.annotate(book_id=F('books__id')).query)
> OUT: SELECT publisher.id, book.id as book_id FROM publisher JOIN book ON
> book.publisher_id = publisher.id
>
> Now, if there are two books fo any publisher, that publisher will be
> outputted twice in the resulting queryset.
>
> Normally you wouldn't want to do this, but even if you do this by
> accident, count() and len(qs) should return the same number. And, for
> .values() querysets this can be actually useful in some cases.
>
> Note that it's possible to remove the select entry for the annotation,
> even if the join can't be removed in this case.
>
> - Anssi
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers (Contributions to Django itself)" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/django-developers.
> To view this discussion on the web visit https://groups.google.com/d/ms
> gid/django-developers/d9f2cded-ef37-4852-8845-1fcc126457ee%
> 40googlegroups.com
> <https://groups.google.com/d/msgid/django-developers/d9f2cded-ef37-4852-8845-1fcc126457ee%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
> For more options, visit https://groups.google.com/d/optout.
>
--
Adam
--
You received this message because you are subscribed to the Google Groups
"Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/django-developers.
To view this discussion on the web visit
https://groups.google.com/d/msgid/django-developers/CAMyDDM2QQyMTQ-9CxYb0SvzoyAv1JMvEm%2BCzTNveBOxpZ0Mx3Q%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.