Hi Andres: Could you please share your thoughts on QUERY 3?
The performance impact of this regression increases *linearly* on larger databases. We concur with Andrew in that this is related to the lack of a Materialize node and mis-costing of the Nested Loop Anti-Join. We found more than 20 regressions related to this commit. We have shared two illustrative examples (QUERIES 3A and 3B) below. - Commit: 77cd477 (Enable parallel query by default.) - Summary: Execution Time (milliseconds) When we increased the scale-factor of TPC-C to 300 (~30 GB), this query ran three times slower on v11 (24 seconds) in comparison to v9.5 (7 seconds). We also found more than 15 regressions related to the same commit and share a couple of them below. +-----------------------+--------+---------+---------+-----------+ | | scale1 | scale10 | scale50 | scale 300 | +-----------------------+--------+---------+---------+-----------+ | Query 3 (v9.5) | 28 | 248 | 1231 | 7265 | | Query 3 (v11) | 74 | 677 | 3345 | 24581 | +-----------------------+--------+---------+---------+-----------+ | Query 3A (v9.5) | 88 | 937 | 4721 | 27241 | | Query 3A (v11) | 288 | 2822 | 13838 | 85081 | +-----------------------+--------+---------+---------+-----------+ | Query 3B (v9.5) | 101 | 934 | 4824 | 29363 | | Query 3B (v11) | 200 | 2331 | 12327 | 74110 | +-----------------------+--------+---------+---------+-----------+ ###### QUERY 3: select cast(ref_1.ol_i_id as int4) as c0 from public.stock as ref_0 left join public.order_line as ref_1 on (ref_1.ol_number is not null) where ref_1.ol_number is null ###### QUERY 3A: select ref_0.ol_delivery_d as c1 from public.order_line as ref_0 where EXISTS ( select ref_1.i_im_id as c0 from public.item as ref_1 where ref_0.ol_d_id <= ref_1.i_im_id ) Execution plan: [OLD version] Nested Loop Semi Join (cost=0.00..90020417940.08 rows=30005835 width=8) (actual time=0.034..24981.895 rows=90017507 loops=1) Join Filter: (ref_0.ol_d_id <= ref_1.i_im_id) -> Seq Scan on order_line ref_0 (cost=0.00..2011503.04 rows=90017504 width=12) (actual time=0.022..7145.811 rows=90017507 loops=1) -> Materialize (cost=0.00..2771.00 rows=100000 width=4) (actual time=0.000..0.000 rows=1 loops=90017507) -> Seq Scan on item ref_1 (cost=0.00..2271.00 rows=100000 width=4) (actual time=0.006..0.006 rows=1 loops=1) Planning time: 0.290 ms Execution time: 27241.239 ms [NEW version] Gather (cost=1000.00..88047487498.82 rows=30005835 width=8) (actual time=0.265..82355.289 rows=90017507 loops=1) Workers Planned: 2 Workers Launched: 2 -> Nested Loop Semi Join (cost=0.00..88044485915.32 rows=12502431 width=8) (actual time=0.033..68529.259 rows=30005836 loops=3) Join Filter: (ref_0.ol_d_id <= ref_1.i_im_id) -> Parallel Seq Scan on order_line ref_0 (cost=0.00..1486400.93 rows=37507293 width=12) (actual time=0.023..2789.901 rows=30005836 loops=3) -> Seq Scan on item ref_1 (cost=0.00..2271.00 rows=100000 width=4) (actual time=0.001..0.001 rows=1 loops=90017507) Planning Time: 0.319 ms Execution Time: 85081.158 ms ###### QUERY 3B: select ref_0.ol_i_id as c0 from public.order_line as ref_0 where EXISTS ( select ref_0.ol_delivery_d as c0 from public.order_line as ref_1 where ref_1.ol_d_id <= cast(nullif(ref_1.ol_o_id, ref_0.ol_i_id) as int4)) Execution plan: [OLD version] Nested Loop Semi Join (cost=0.00..115638730740936.53 rows=30005835 width=4) (actual time=0.017..27009.302 rows=90017507 loops=1) Join Filter: (ref_1.ol_d_id <= NULLIF(ref_1.ol_o_id, ref_0.ol_i_id)) Rows Removed by Join Filter: 11557 -> Seq Scan on order_line ref_0 (cost=0.00..2011503.04 rows=90017504 width=4) (actual time=0.009..7199.540 rows=90017507 loops=1) -> Materialize (cost=0.00..2813221.56 rows=90017504 width=8) (actual time=0.000..0.000 rows=1 loops=90017507) -> Seq Scan on order_line ref_1 (cost=0.00..2011503.04 rows=90017504 width=8) (actual time=0.001..0.002 rows=14 loops=1) Planning time: 0.252 ms Execution time: 29363.737 ms [NEW version] Gather (cost=1000.00..84060490326155.39 rows=30005835 width=4) (actual time=0.272..71712.491 rows=90017507 loops=1) Workers Planned: 2 Workers Launched: 2 -> Nested Loop Semi Join (cost=0.00..84060487324571.89 rows=12502431 width=4) (actual time=0.046..60153.472 rows=30005836 loops=3) Join Filter: (ref_1.ol_d_id <= NULLIF(ref_1.ol_o_id, ref_0.ol_i_id)) Rows Removed by Join Filter: 1717 -> Parallel Seq Scan on order_line ref_0 (cost=0.00..1486400.93 rows=37507293 width=4) (actual time=0.023..2819.361 rows=30005836 loops=3) -> Seq Scan on order_line ref_1 (cost=0.00..2011503.04 rows=90017504 width=8) (actual time=0.001..0.001 rows=1 loops=90017507) Planning Time: 0.334 ms Execution Time: 74110.942 ms