#32632: Query resolution can be at least 3x slower in 3.2
-------------------------------------+-------------------------------------
Reporter: Phillip Cutter | Owner: Simon
| Charette
Type: Bug | Status: assigned
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Simon Charette):
Alright so I did a bit of digging in the past hour and here are my
findings
Before bbf141bcdc31f1324048af9233583a523ac54c94 `Query.__eq__` was wrongly
returning `True` due to making it a subclass `BaseExpression` in
35431298226165986ad07e91f9d3aca721ff38ec. Before these changes
`Query.__eq__` was not implemented which meant that
`Model.objects.filter(foo=bar).query !=
Model.objects.filter(foo=bar).query` as `object.__eq__` is based of
`id(self) == id(other)`. This is problematic for expressions such as
`Exists` and `Subquery` which are based off `sql.Query` and need to be
able to compare to each other to implement the ''expression'' API.
Since `sql.Query` is an immutable internal object we could possibly
`cached_property(identity)` and move along as that cuts the slowdown in
about half, that does feel like a bandaid solution though. It's
interesting to me that `Node.add` would compare all its children in the
first place as that seems like a costly operation but I do wonder if
After playing around with a few things here's a list of two proposed
solutions
1. Remove the `tree.Node.add` branch for `if data in self.children`. While
[https://djangoci.com/job/django-coverage/HTML_20Coverage_20Report/ it's
covered by the suite] only
[https://github.com/django/django/blob/ca9872905559026af82000e46cde6f7dedc897b6/tests/queries/tests.py#L1737-L1747
a single test in the whole suite reaches it] the branch under arguably
convoluted conditions while the check is performed 99% of the time when
`filter` and `exclude` are called to perform expensive comparisons of
contained expressions. I wouldn't be surprised if it was at the origin of
others recent slowdowns in the ORM query construction due to the
introduction of `BaseExpression.identity` to needs for deconstruc-ability
imposed by `Index(condition)` and `Constraint(condition)` for little
benefit.
2. Make `Subquery` (and `Exists`) explicitly non-deconstructible; in
practice they already are because `sql.Query` doesn't properly implement
`deconstruct` so there's little value in implementing `identity` for them
in the first place.
3. Remove the `sql.Query(BaseExpression)` subclassing as it's no longer
necessary in practice. That'll have the side effect of removing support
for `Model.objects.filter(foo=bar).query ==
Model.objects.filter(foo=bar).query` which is a nice property but since we
don't internally have a need for it ''yet'' there's little point in the
complexity it incurs.
I'd suggest we backport 2. and 3. to address this ticket and move the
discussion about 1. in a new ''Cleanup/optimization'' one. Let me know if
that makes sense to you and I'll commit to a PR before the end of April.
--
Ticket URL: <https://code.djangoproject.com/ticket/32632#comment:4>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
--
You received this message because you are subscribed to the Google Groups
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/django-updates/065.c3d17e4900770e224ae287b315a50ce7%40djangoproject.com.