#28455: Create "inplace" QuerySets to speed up certain operations
-------------------------------------+-------------------------------------
Reporter: Anssi Kääriäinen | Owner: Keryn Knight
Type: Cleanup/optimization | Status: assigned
Component: Database layer (models, ORM) | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Keryn Knight):
* needs_better_patch: 0 => 1
* has_patch: 0 => 1
Comment:
Updated slightly, and now I've sat down to get cProfile information using
`%prun for _ in range(100): tuple(User.objects.prefetch_related('groups__permissions', 'user_permissions'))`
First, the baseline, showing only the operations related to the change, for
brevity (so no `Model.__init__`, etc.):
{{{
 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
  31200    0.165    0.000    0.303    0.000 query.py:290(clone)
    300    0.098    0.000    3.134    0.010 query.py:1860(prefetch_one_level)
  31200    0.066    0.000    0.444    0.000 query.py:1337(_clone)
  30200    0.055    0.000    0.655    0.000 related_descriptors.py:883(_apply_rel_filters)
  40600    0.052    0.000    1.231    0.000 query.py:45(__iter__)
  30200    0.051    0.000    0.897    0.000 related_descriptors.py:899(get_queryset)
  31200    0.046    0.000    0.500    0.000 query.py:1325(_chain)
  40200    0.041    0.000    0.327    0.000 base.py:511(from_db)
  30500    0.039    0.000    0.931    0.000 query.py:982(_filter_or_exclude)
  30600    0.030    0.000    0.194    0.000 manager.py:142(get_queryset)
  31200    0.029    0.000    0.337    0.000 query.py:341(chain)
  31200    0.028    0.000    0.070    0.000 where.py:142(clone)
}}}
Next, using the `@contextmanager` decorator. The lines are shown in the
same order as above, so they're effectively ordered by the baseline's
internal time:
{{{
 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   1000    0.007    0.000    0.013    0.000 query.py:290(clone)
    300    0.082    0.000    2.932    0.010 query.py:1881(prefetch_one_level)
   1000    0.003    0.000    0.019    0.000 query.py:1358(_clone)
  30200    0.097    0.000    0.444    0.000 related_descriptors.py:884(_apply_rel_filters)
  40600    0.052    0.000    0.947    0.000 query.py:46(__iter__)
  30200    0.054    0.000    1.039    0.000 related_descriptors.py:901(get_queryset)
  31200    0.029    0.000    0.056    0.000 query.py:1343(_chain)
  40200    0.041    0.000    0.376    0.000 base.py:511(from_db)
  30500    0.041    0.000    0.496    0.000 query.py:984(_filter_or_exclude)
  30600    0.032    0.000    0.545    0.000 manager.py:142(get_queryset)
   1000    0.001    0.000    0.014    0.000 query.py:341(chain)
   1000    0.002    0.000    0.003    0.000 where.py:142(clone)
    ...
  30600    0.081    0.000    0.086    0.000 contextlib.py:86(__init__)
  60400    0.024    0.000    0.034    0.000 query.py:1335(_avoid_cloning)
  30600    0.017    0.000    0.067    0.000 contextlib.py:121(__exit__)
  30600    0.016    0.000    0.102    0.000 contextlib.py:242(helper)
  30600    0.016    0.000    0.042    0.000 contextlib.py:112(__enter__)
}}}
And then finally with a custom context manager class, again using the same
ordering as the baseline:
{{{
 ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   1000    0.010    0.000    0.016    0.000 query.py:290(clone)
    300    0.095    0.000    3.293    0.011 query.py:1888(prefetch_one_level)
   1000    0.004    0.000    0.023    0.000 query.py:1365(_clone)
  30200    0.102    0.000    0.412    0.000 related_descriptors.py:884(_apply_rel_filters)
  40600    0.062    0.000    1.133    0.000 query.py:46(__iter__)
  30200    0.063    0.000    1.032    0.000 related_descriptors.py:901(get_queryset)
  31200    0.077    0.000    0.111    0.000 query.py:1350(_chain)
  40200    0.049    0.000    0.391    0.000 base.py:511(from_db)
  30500    0.070    0.000    0.650    0.000 query.py:996(_filter_or_exclude)
  30600    0.038    0.000    0.563    0.000 manager.py:142(get_queryset)
   1000    0.002    0.000    0.018    0.000 query.py:341(chain)
   1000    0.002    0.000    0.004    0.000 where.py:142(clone)
    ...
  30200    0.006    0.000    0.006    0.000 query.py:178(__init__)
  30200    0.016    0.000    0.022    0.000 query.py:1347(_avoid_cloning)
  30200    0.014    0.000    0.020    0.000 query.py:184(__exit__)
  30200    0.015    0.000    0.021    0.000 query.py:181(__enter__)
}}}
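For reference, the `@contextmanager` variant whose `contextlib` entries show up in the second profile presumably looks something like this sketch. It uses a toy `QuerySet`, not Django's; only the `_avoid_cloning`/`_chain` names come from the patch, the rest is illustrative:
{{{
#!python
from contextlib import contextmanager

class QuerySet:
    """Toy stand-in for django.db.models.QuerySet (illustrative only)."""

    def __init__(self):
        self._cloning = True
        self.filters = []

    @contextmanager
    def _avoid_cloning(self):
        # While the flag is off, _chain() mutates in place instead of
        # deep-copying the underlying query for every filter() call.
        self._cloning = False
        try:
            yield self
        finally:
            self._cloning = True

    def _chain(self):
        if not self._cloning:
            return self  # in-place: skip the expensive clone
        clone = QuerySet()
        clone.filters = list(self.filters)
        return clone

    def filter(self, **kwargs):
        obj = self._chain()
        obj.filters.append(kwargs)
        return obj

qs = QuerySet()
with qs._avoid_cloning():
    result = qs.filter(pk=1).filter(name="x")
assert result is qs          # no clones were made inside the block
assert qs.filter(pk=2) is not qs  # the normal path still clones
}}}
The cost visible in the profile (`contextlib.py:86(__init__)`, `helper`, `__enter__`, `__exit__`) comes from the generator wrapper that `@contextmanager` constructs on every call.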
Using the custom context manager is substantially faster than the
`contextlib` decorator, at least in the simple form I currently have.
Here's the decorator version:
{{{
In [1]: from django.contrib.auth.models import User
In [2]: x = User.objects.all()
In [3]: %timeit x._avoid_cloning()
1.01 µs ± 18.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [4]: %timeit with x._avoid_cloning(): pass
1.98 µs ± 63.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
}}}
and then using the custom class:
{{{
In [3]: %timeit x._avoid_cloning()
276 ns ± 2.14 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [4]: %timeit with x._avoid_cloning(): pass
819 ns ± 39.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
}}}
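The class-based variant behind those numbers could be sketched as follows. The `__init__`/`__enter__`/`__exit__` methods correspond to the `query.py:178`/`181`/`184` entries in the third profile; the class and attribute names here are illustrative, not the actual patch:
{{{
#!python
class AvoidCloning:
    """Hand-rolled context manager: no generator frame, no contextlib
    helper/wrapper calls, just plain object construction."""
    __slots__ = ("qs",)

    def __init__(self, qs):
        self.qs = qs

    def __enter__(self):
        self.qs._cloning = False
        return self.qs

    def __exit__(self, exc_type, exc_value, tb):
        # Always restore the flag; never swallow exceptions.
        self.qs._cloning = True
        return False

class QuerySet:
    """Toy stand-in for django.db.models.QuerySet (illustrative only)."""

    def __init__(self):
        self._cloning = True

    def _avoid_cloning(self):
        return AvoidCloning(self)

qs = QuerySet()
with qs._avoid_cloning():
    assert qs._cloning is False
assert qs._cloning is True
}}}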
py-spy (a sampling profiler) also suggests `sql.Query.clone` accounts for
about 3% of the time spent in the baseline, whereas once cloning avoidance
is introduced it doesn't even get sampled enough to be relevant.
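For anyone wanting to reproduce the micro-benchmark outside IPython, a plain `timeit` comparison of the two context-manager styles (stripped of any `QuerySet` involvement, so only the entry/exit overhead is measured; names are made up) might look like:
{{{
#!python
import timeit
from contextlib import contextmanager

class _NoCloneClass:
    """Hand-rolled context manager, mirroring the class-based variant."""
    __slots__ = ()
    def __enter__(self):
        return self
    def __exit__(self, exc_type, exc_value, tb):
        return False

@contextmanager
def _no_clone_gen():
    """@contextmanager variant: builds a generator wrapper per call."""
    yield

t_class = timeit.timeit("with cm(): pass",
                        globals={"cm": _NoCloneClass}, number=100_000)
t_gen = timeit.timeit("with cm(): pass",
                      globals={"cm": _no_clone_gen}, number=100_000)
print(f"class: {t_class:.4f}s  @contextmanager: {t_gen:.4f}s")
}}}
On CPython the class version should come out well ahead, consistent with the `%timeit` numbers above, because entering it is one attribute-free object construction rather than generator creation plus the `contextlib` machinery.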
Meanwhile, linters are the bane of my life, so the patch remains 'needs
improvement'.
--
Ticket URL: <https://code.djangoproject.com/ticket/28455#comment:5>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.