#33015: QuerySet.extra Use Case: filtering on large lists of tuples -------------------------------------+------------------------------------- Reporter: kris-swann | Owner: nobody Type: Uncategorized | Status: new Component: Database layer | Version: 2.2 (models, ORM) | Severity: Normal | Resolution: Keywords: QuerySet.extra | Triage Stage: | Unreviewed Has patch: 0 | Needs documentation: 0 Needs tests: 0 | Patch needs improvement: 0 Easy pickings: 0 | UI/UX: 0 -------------------------------------+------------------------------------- Changes (by Keryn Knight):
* cc: Keryn Knight (added) Comment: Having been bitten by this same thing multiple times over the years, I'm interested. The performance profile of this is reproducible for me on the `2.2.9` and `3.2.6` tags: {{{ In [5]: %%timeit ...: query = Q() ...: for v1, v2 in items_to_fetch: ...: query |= Q(val1=v1, val2=v2) 21.9 s ± 1.01 s per loop (mean ± std. dev. of 7 runs, 1 loop each) In [7]: %timeit -r3 -n1 x = ExampleModel.objects.filter(query) 48.2 s ± 1.54 s per loop (mean ± std. dev. of 3 runs, 1 loop each) }}} but doesn't appear to be a ''big'' issue on `main` since #32940; my suspicion is that both `WhereNode` and `Q` were hitting the pathological case for `data in self.children` being O(n), in case that rings a bell Simon; below is main as of `8208381ba6`; the problem was present at `501a8db465` and basically gone at `ff661dbd50`: {{{ In [5]: %%timeit ...: query = Q() ...: for v1, v2 in items_to_fetch: ...: query |= Q(val1=v1, val2=v2) 209 ms ± 1.93 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) In [7]: %timeit x = ExampleModel.objects.filter(query) 644 ms ± 22.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each) }}} Though it's still faster to avoid combining Q objects and kwargs: {{{ In [14]: %timeit Q(*(Q(('val1', v1), ('val2', v2)) for v1, v2 in items_to_fetch), _connector=Q.OR) 15.1 ms ± 761 µs per loop (mean ± std. dev. of 7 runs, 100 loops each) }}} That's true for `2.2.9` too, but regrettably has less impact on the query compilation itself: {{{ In [4]: %timeit Q(*(Q(('val1', v1), ('val2', v2)) for v1, v2 in items_to_fetch), _connector=Q.OR) 22.9 ms ± 12.7 ms per loop (mean ± std. dev. of 7 runs, 100 loops each) In [5]: %timeit -r3 -n1 x = ExampleModel.objects.filter(query) 20.6 s ± 874 ms per loop (mean ± std. dev. of 3 runs, 1 loop each) }}} I don't know that the simple Q based version will ever be as fast as `extra` (as used above) or the `ExpressionTuple` (also above), but it begins to look ''almost'' acceptable, certainly compared to the previous. -- Ticket URL: <https://code.djangoproject.com/ticket/33015#comment:3> Django <https://code.djangoproject.com/> The Web framework for perfectionists with deadlines. -- You received this message because you are subscribed to the Google Groups "Django updates" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-updates+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/068.7d3d0b1a2d3d23558ba619a9aa3218a8%40djangoproject.com.