#36552: Repeating QuerySet.filter() generates unwanted LEFT OUTER JOINs -------------------------------------+------------------------------------- Reporter: Márton Salomváry | Owner: (none) Type: | Status: closed Cleanup/optimization | Component: Database layer | Version: 5.0 (models, ORM) | Severity: Normal | Resolution: invalid Keywords: chained-filters | Triage Stage: | Unreviewed Has patch: 0 | Needs documentation: 0 Needs tests: 0 | Patch needs improvement: 0 Easy pickings: 0 | UI/UX: 0 -------------------------------------+------------------------------------- Changes (by Natalia Bidart):
* cc: Simon Charette (added) * keywords: queryset filter regression => chained-filters * resolution: => invalid * status: new => closed * type: Bug => Cleanup/optimization Comment: Hello Márton Salomváry, thank you for your report, as well as for the test project and bisected revision number. In general, Django does not make guarantees about the exact SQL that is generated. This means SQL can change from version to version, and we would not consider that a bug as long as the semantics of the query remain the same (which, based on my testing, is the case here). The resulting queryset still returns the correct values. That said, we do care about potential performance impacts and/or regressions. When triaging, I focused on how this change affects query planning, and I can confirm that the query plan is different/worse (see EXPLAIN ANALYZE results for Postgres comparing `4.2.x` vs `main`). I do wonder, however, whether the way the queries are currently written might be more complex than necessary, which could contribute to these differences in execution plans. For example, this alternative queryset is, IMHO, equivalent and much more straightfoward: {{{#!python Alpha.objects.filter( Q(bravos__charlie_bravos__charlie=9999) & Exists( Delta.objects.filter( bravos__id=OuterRef("bravos__charlie_bravos__bravo_id") ) ) ) }}} Because of the above, I'll close the ticket as `invalid`. If you encounter a case where a query written in a way that aligns with Django's ORM patterns (as shown above) returns incorrect results or performs poorly, please feel free to reopen it with details. Thanks again! === Appendix === `EXPLAIN ANALYZE` for `stable/4.2.x`: {{{ Nested Loop (cost=7.92..43.22 rows=6 width=122) (actual time=0.010..0.013 rows=1 loops=1) Output: test_filter_regression_alpha.id, test_filter_regression_alpha.name Inner Unique: true -> Nested Loop (cost=7.64..41.29 rows=6 width=4) (actual time=0.009..0.011 rows=1 loops=1) Output: test_filter_regression_bravo_alphas.alpha_id -> Nested Loop (cost=7.49..40.26 rows=2 width=12) (actual time=0.008..0.010 rows=1 loops=1) Output: test_filter_regression_bravo.id, u1.bravo_id, test_filter_regression_charliebravo.bravo_id Inner Unique: true Join Filter: (test_filter_regression_bravo.id = test_filter_regression_charliebravo.bravo_id) -> Nested Loop Semi Join (cost=7.22..39.05 rows=2 width=8) (actual time=0.007..0.008 rows=1 loops=1) Output: test_filter_regression_charliebravo.bravo_id, u1.bravo_id -> Bitmap Heap Scan on public.test_filter_regression_charliebravo (cost=4.17..11.28 rows=3 width=4) (actual time=0.003..0.003 rows=2 loops=1) Output: test_filter_regression_charliebravo.id, test_filter_regression_charliebravo.name, test_filter_regression_charliebravo.charlie_id, test_filter_regression_charliebravo.bravo_id Recheck Cond: (test_filter_regression_charliebravo.charlie_id = 9999) Heap Blocks: exact=1 -> Bitmap Index Scan on test_filter_regression_charliebravo_charlie_id_1f151049 (cost=0.00..4.17 rows=3 width=0) (actual time=0.002..0.002 rows=2 loops=1) Index Cond: (test_filter_regression_charliebravo.charlie_id = 9999) -> Nested Loop (cost=3.05..13.34 rows=540 width=4) (actual time=0.002..0.002 rows=0 loops=2) Output: u1.bravo_id Inner Unique: true -> Bitmap Heap Scan on public.test_filter_regression_delta_bravos u1 (cost=2.90..11.43 rows=10 width=8) (actual time=0.001..0.001 rows=0 loops=2) Output: u1.id, u1.delta_id, u1.bravo_id Recheck Cond: (test_filter_regression_charliebravo.bravo_id = u1.bravo_id) Heap Blocks: exact=1 -> Bitmap Index Scan on test_filter_regression_delta_bravos_bravo_id_5b518e55 (cost=0.00..2.89 rows=10 width=0) (actual time=0.000..0.000 rows=0 loops=2) Index Cond: (u1.bravo_id = test_filter_regression_charliebravo.bravo_id) -> Index Only Scan using test_filter_regression_delta_pkey on public.test_filter_regression_delta u0 (cost=0.15..0.19 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1) Output: u0.id Index Cond: (u0.id = u1.delta_id) Heap Fetches: 1 -> Index Only Scan using test_filter_regression_bravo_pkey on public.test_filter_regression_bravo (cost=0.28..0.59 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1) Output: test_filter_regression_bravo.id Index Cond: (test_filter_regression_bravo.id = u1.bravo_id) Heap Fetches: 1 -> Index Scan using test_filter_regression_bravo_alphas_bravo_id_042061e7 on public.test_filter_regression_bravo_alphas (cost=0.15..0.42 rows=10 width=8) (actual time=0.001..0.001 rows=1 loops=1) Output: test_filter_regression_bravo_alphas.id, test_filter_regression_bravo_alphas.bravo_id, test_filter_regression_bravo_alphas.alpha_id Index Cond: (test_filter_regression_bravo_alphas.bravo_id = test_filter_regression_bravo.id) -> Index Scan using test_filter_regression_alpha_pkey on public.test_filter_regression_alpha (cost=0.28..0.32 rows=1 width=122) (actual time=0.001..0.001 rows=1 loops=1) Output: test_filter_regression_alpha.id, test_filter_regression_alpha.name Index Cond: (test_filter_regression_alpha.id = test_filter_regression_bravo_alphas.alpha_id) Planning Time: 0.382 ms Execution Time: 0.027 ms }}} For `main`: {{{ Hash Join (cost=79.87..131.14 rows=20 width=122) (actual time=0.074..0.171 rows=1 loops=1) Output: test_filter_regression_alpha.id, test_filter_regression_alpha.name Inner Unique: true Hash Cond: (t6.bravo_id = u1.bravo_id) -> Nested Loop (cost=12.32..63.27 rows=40 width=134) (actual time=0.046..0.143 rows=1 loops=1) Output: test_filter_regression_alpha.id, test_filter_regression_alpha.name, t6.bravo_id, t7.id, t8.bravo_id -> Nested Loop (cost=12.17..52.62 rows=42 width=130) (actual time=0.041..0.137 rows=1 loops=1) Output: test_filter_regression_alpha.id, test_filter_regression_alpha.name, t6.bravo_id, t7.id Inner Unique: true -> Nested Loop (cost=11.90..39.10 rows=42 width=126) (actual time=0.034..0.130 rows=1 loops=1) Output: test_filter_regression_alpha.id, test_filter_regression_alpha.name, t6.bravo_id Join Filter: (test_filter_regression_alpha.id = t6.alpha_id) -> Nested Loop (cost=11.74..33.86 rows=11 width=126) (actual time=0.030..0.126 rows=1 loops=1) Output: test_filter_regression_alpha.id, test_filter_regression_alpha.name, test_filter_regression_bravo_alphas.alpha_id Inner Unique: true -> Nested Loop (cost=11.47..30.32 rows=11 width=4) (actual time=0.026..0.121 rows=1 loops=1) Output: test_filter_regression_bravo_alphas.alpha_id -> Hash Join (cost=11.32..28.77 rows=3 width=8) (actual time=0.018..0.112 rows=2 loops=1) Output: test_filter_regression_bravo.id, test_filter_regression_charliebravo.bravo_id Hash Cond: (test_filter_regression_bravo.id = test_filter_regression_charliebravo.bravo_id) -> Seq Scan on public.test_filter_regression_bravo (cost=0.00..15.40 rows=540 width=4) (actual time=0.003..0.051 rows=900 loops=1) Output: test_filter_regression_bravo.id, test_filter_regression_bravo.name -> Hash (cost=11.28..11.28 rows=3 width=4) (actual time=0.006..0.006 rows=2 loops=1) Output: test_filter_regression_charliebravo.bravo_id Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Bitmap Heap Scan on public.test_filter_regression_charliebravo (cost=4.17..11.28 rows=3 width=4) (actual time=0.003..0.003 rows=2 loops=1) Output: test_filter_regression_charliebravo.bravo_id Recheck Cond: (test_filter_regression_charliebravo.charlie_id = 9999) Heap Blocks: exact=1 -> Bitmap Index Scan on test_filter_regression_charliebravo_charlie_id_1f151049 (cost=0.00..4.17 rows=3 width=0) (actual time=0.002..0.002 rows=2 loops=1) Index Cond: (test_filter_regression_charliebravo.charlie_id = 9999) -> Index Scan using test_filter_regression_bravo_alphas_bravo_id_042061e7 on public.test_filter_regression_bravo_alphas (cost=0.15..0.42 rows=10 width=8) (actual time=0.004..0.004 rows=0 loops=2) Output: test_filter_regression_bravo_alphas.id, test_filter_regression_bravo_alphas.bravo_id, test_filter_regression_bravo_alphas.alpha_id Index Cond: (test_filter_regression_bravo_alphas.bravo_id = test_filter_regression_bravo.id) -> Index Scan using test_filter_regression_alpha_pkey on public.test_filter_regression_alpha (cost=0.28..0.32 rows=1 width=122) (actual time=0.004..0.004 rows=1 loops=1) Output: test_filter_regression_alpha.id, test_filter_regression_alpha.name Index Cond: (test_filter_regression_alpha.id = test_filter_regression_bravo_alphas.alpha_id) -> Index Scan using test_filter_regression_bravo_alphas_alpha_id_b2c31687 on public.test_filter_regression_bravo_alphas t6 (cost=0.15..0.35 rows=10 width=8) (actual time=0.004..0.004 rows=1 loops=1) Output: t6.id, t6.bravo_id, t6.alpha_id Index Cond: (t6.alpha_id = test_filter_regression_bravo_alphas.alpha_id) -> Index Only Scan using test_filter_regression_bravo_pkey on public.test_filter_regression_bravo t7 (cost=0.28..0.32 rows=1 width=4) (actual time=0.007..0.007 rows=1 loops=1) Output: t7.id Index Cond: (t7.id = t6.bravo_id) Heap Fetches: 1 -> Index Only Scan using test_filter_regression_charliebravo_bravo_id_1b159238 on public.test_filter_regression_charliebravo t8 (cost=0.15..0.22 rows=3 width=4) (actual time=0.005..0.005 rows=1 loops=1) Output: t8.bravo_id Index Cond: (t8.bravo_id = t6.bravo_id) Heap Fetches: 1 -> Hash (cost=65.05..65.05 rows=200 width=4) (actual time=0.020..0.021 rows=2 loops=1) Output: u1.bravo_id Buckets: 1024 Batches: 1 Memory Usage: 9kB -> HashAggregate (cost=63.05..65.05 rows=200 width=4) (actual time=0.017..0.017 rows=2 loops=1) Output: u1.bravo_id Group Key: u1.bravo_id Batches: 1 Memory Usage: 40kB -> Hash Join (cost=22.15..57.95 rows=2040 width=4) (actual time=0.015..0.016 rows=2 loops=1) Output: u1.bravo_id Inner Unique: true Hash Cond: (u1.delta_id = u0.id) -> Seq Scan on public.test_filter_regression_delta_bravos u1 (cost=0.00..30.40 rows=2040 width=8) (actual time=0.001..0.001 rows=2 loops=1) Output: u1.id, u1.delta_id, u1.bravo_id -> Hash (cost=15.40..15.40 rows=540 width=4) (actual time=0.004..0.004 rows=2 loops=1) Output: u0.id Buckets: 1024 Batches: 1 Memory Usage: 9kB -> Seq Scan on public.test_filter_regression_delta u0 (cost=0.00..15.40 rows=540 width=4) (actual time=0.001..0.001 rows=2 loops=1) Output: u0.id Planning Time: 2.462 ms Execution Time: 0.238 ms }}} For `main` with optimized queryset: {{{ Nested Loop (cost=7.92..43.22 rows=6 width=122) (actual time=0.010..0.012 rows=1 loops=1) Output: test_filter_regression_alpha.id, test_filter_regression_alpha.name Inner Unique: true -> Nested Loop (cost=7.64..41.29 rows=6 width=4) (actual time=0.009..0.011 rows=1 loops=1) Output: test_filter_regression_bravo_alphas.alpha_id -> Nested Loop (cost=7.49..40.26 rows=2 width=12) (actual time=0.008..0.010 rows=1 loops=1) Output: test_filter_regression_bravo.id, u1.bravo_id, test_filter_regression_charliebravo.bravo_id Inner Unique: true Join Filter: (test_filter_regression_bravo.id = test_filter_regression_charliebravo.bravo_id) -> Nested Loop Semi Join (cost=7.22..39.05 rows=2 width=8) (actual time=0.006..0.008 rows=1 loops=1) Output: test_filter_regression_charliebravo.bravo_id, u1.bravo_id -> Bitmap Heap Scan on public.test_filter_regression_charliebravo (cost=4.17..11.28 rows=3 width=4) (actual time=0.003..0.003 rows=2 loops=1) Output: test_filter_regression_charliebravo.id, test_filter_regression_charliebravo.name, test_filter_regression_charliebravo.charlie_id, test_filter_regression_charliebravo.bravo_id Recheck Cond: (test_filter_regression_charliebravo.charlie_id = 9999) Heap Blocks: exact=1 -> Bitmap Index Scan on test_filter_regression_charliebravo_charlie_id_1f151049 (cost=0.00..4.17 rows=3 width=0) (actual time=0.002..0.002 rows=2 loops=1) Index Cond: (test_filter_regression_charliebravo.charlie_id = 9999) -> Nested Loop (cost=3.05..13.34 rows=540 width=4) (actual time=0.002..0.002 rows=0 loops=2) Output: u1.bravo_id Inner Unique: true -> Bitmap Heap Scan on public.test_filter_regression_delta_bravos u1 (cost=2.90..11.43 rows=10 width=8) (actual time=0.001..0.001 rows=0 loops=2) Output: u1.id, u1.delta_id, u1.bravo_id Recheck Cond: (test_filter_regression_charliebravo.bravo_id = u1.bravo_id) Heap Blocks: exact=1 -> Bitmap Index Scan on test_filter_regression_delta_bravos_bravo_id_5b518e55 (cost=0.00..2.89 rows=10 width=0) (actual time=0.000..0.000 rows=0 loops=2) Index Cond: (u1.bravo_id = test_filter_regression_charliebravo.bravo_id) -> Index Only Scan using test_filter_regression_delta_pkey on public.test_filter_regression_delta u0 (cost=0.15..0.19 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1) Output: u0.id Index Cond: (u0.id = u1.delta_id) Heap Fetches: 1 -> Index Only Scan using test_filter_regression_bravo_pkey on public.test_filter_regression_bravo (cost=0.28..0.59 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1) Output: test_filter_regression_bravo.id Index Cond: (test_filter_regression_bravo.id = u1.bravo_id) Heap Fetches: 1 -> Index Scan using test_filter_regression_bravo_alphas_bravo_id_042061e7 on public.test_filter_regression_bravo_alphas (cost=0.15..0.42 rows=10 width=8) (actual time=0.001..0.001 rows=1 loops=1) Output: test_filter_regression_bravo_alphas.id, test_filter_regression_bravo_alphas.bravo_id, test_filter_regression_bravo_alphas.alpha_id Index Cond: (test_filter_regression_bravo_alphas.bravo_id = test_filter_regression_bravo.id) -> Index Scan using test_filter_regression_alpha_pkey on public.test_filter_regression_alpha (cost=0.28..0.32 rows=1 width=122) (actual time=0.001..0.001 rows=1 loops=1) Output: test_filter_regression_alpha.id, test_filter_regression_alpha.name Index Cond: (test_filter_regression_alpha.id = test_filter_regression_bravo_alphas.alpha_id) Planning Time: 0.381 ms Execution Time: 0.027 ms }}} -- Ticket URL: <https://code.djangoproject.com/ticket/36552#comment:3> Django <https://code.djangoproject.com/> The Web framework for perfectionists with deadlines. -- You received this message because you are subscribed to the Google Groups "Django updates" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-updates+unsubscr...@googlegroups.com. To view this discussion visit https://groups.google.com/d/msgid/django-updates/01070198bddf0a0f-eb4187b0-0010-4d22-9125-a5c8a4c27e29-000000%40eu-central-1.amazonses.com.