Hi hackers, While studying a regression reported[1] against my parallel hash join patch, I noticed that we can also reach a good and a bad plan in unpatched master. One of the causes seems to be the estimated selectivity of a semi-join with an extra <> filter qual.
Here are some times I measured for TPCH Q21 at scale 10 and work_mem of 1GB. That is a query with a large anti-join and a large semi-join. 8 workers = 8.3s 7 workers = 8.2s 6 workers = 8.5s 5 workers = 8.9s 4 workers = 9.5s 3 workers = 39.7s 2 workers = 36.9s 1 worker = 38.2s 0 workers = 47.9s Please see the attached query plans showing the change in plan from Hash Semi Join to Nested Loop Semi Join that happens only once we reach 4 workers and the (partial) base relation size becomes smaller. The interesting thing is that row estimate for the semi-join and anti-join come out as 1 (I think this is 0 clamped to 1). The same thing can be seen with a simple semi-join, if you happen to have TPCH loaded. Compare these two queries: SELECT * FROM lineitem l1 WHERE EXISTS (SELECT * FROM lineitem l2 WHERE l1.l_orderkey = l2.l_orderkey); -> estimates 59986012 rows, actual rows 59,986,052 (scale 10 TPCH) SELECT * FROM lineitem l1 WHERE EXISTS (SELECT * FROM lineitem l2 WHERE l1.l_orderkey = l2.l_orderkey AND l1.l_suppkey <> l2.l_suppkey); -> estimates 1 row, actual rows 57,842,090 (scale 10 TPCH) Or for a standalone example: CREATE TABLE foo AS SELECT (generate_series(1, 1000000) / 4)::int AS a, (generate_series(1, 1000000) % 100)::int AS b; ANALYZE foo; SELECT * FROM foo f1 WHERE EXISTS (SELECT * FROM foo f2 WHERE f1.a = f2.a); -> estimates 1,000,000 rows SELECT * FROM foo f1 WHERE EXISTS (SELECT * FROM foo f2 WHERE f1.a = f2.a AND f1.b <> f2.b); -> estimates 1 row I'm trying to wrap my brain around the selectivity code, but am too green to grok how this part of the planner that I haven't previously focused on works so far, and I'd like to understand whether this is expected behaviour so that I can figure out how to tackle the reported regression with my patch. What is happening here? Thanks for reading. [1] https://www.postgresql.org/message-id/CAEepm%3D3Og-7-b3WOkiT%3Dc%2B6y3eZ0VVSyb1K%2BSOvF17BO5KAt0A%40mail.gmail.com -- Thomas Munro http://www.enterprisedb.com
QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=4642943.81..4642943.82 rows=1 width=34) (actual time=39806.068..39806.080 rows=100 loops=1) -> Sort (cost=4642943.81..4642943.82 rows=1 width=34) (actual time=39806.064..39806.070 rows=100 loops=1) Sort Key: (count(*)) DESC, supplier.s_name Sort Method: top-N heapsort Memory: 38kB -> GroupAggregate (cost=4642943.78..4642943.80 rows=1 width=34) (actual time=39796.916..39804.615 rows=3945 loops=1) Group Key: supplier.s_name -> Sort (cost=4642943.78..4642943.79 rows=1 width=26) (actual time=39796.906..39800.147 rows=38905 loops=1) Sort Key: supplier.s_name Sort Method: quicksort Memory: 4576kB -> Nested Loop Anti Join (cost=2861152.85..4642943.77 rows=1 width=26) (actual time=19851.808..39565.632 rows=38905 loops=1) -> Nested Loop (cost=2861152.29..4642900.65 rows=1 width=42) (actual time=19850.550..35747.936 rows=696628 loops=1) -> Gather (cost=2861151.85..4642893.24 rows=1 width=50) (actual time=19850.323..29654.121 rows=1441593 loops=1) Workers Planned: 3 Workers Launched: 3 -> Hash Semi Join (cost=2860151.85..4641893.14 rows=1 width=50) (actual time=21036.398..30323.561 rows=360398 loops=4) Hash Cond: (l1.l_orderkey = l2.l_orderkey) Join Filter: (l2.l_suppkey <> l1.l_suppkey) Rows Removed by Join Filter: 93610 -> Hash Join (cost=2585.58..1486212.61 rows=258004 width=42) (actual time=17.681..3985.606 rows=373726 loops=4) Hash Cond: (l1.l_suppkey = supplier.s_suppkey) -> Parallel Seq Scan on lineitem l1 (cost=0.00..1456859.08 rows=6450109 width=16) (actual time=0.031..2986.281 rows=9482337 loops=4) Filter: (l_receiptdate > l_commitdate) Rows Removed by Filter: 5514176 -> Hash (cost=2535.58..2535.58 rows=4000 width=30) (actual time=17.410..17.410 rows=3945 loops=4) Buckets: 4096 Batches: 1 Memory Usage: 279kB -> Nested Loop (cost=79.29..2535.58 rows=4000 width=30) (actual time=1.872..15.971 rows=3945 loops=4) -> Seq Scan on nation (cost=0.00..1.31 rows=1 width=4) (actual time=0.026..0.034 rows=1 loops=4) Filter: (n_name = 'ETHIOPIA'::bpchar) Rows Removed by Filter: 24 -> Bitmap Heap Scan on supplier (cost=79.29..2494.27 rows=4000 width=38) (actual time=1.838..14.950 rows=3945 loops=4) Recheck Cond: (s_nationkey = nation.n_nationkey) Heap Blocks: exact=1898 -> Bitmap Index Scan on idx_supplier_nation_key (cost=0.00..78.29 rows=4000 width=0) (actual time=1.309..1.309 rows=3945 loops=4) Index Cond: (s_nationkey = nation.n_nationkey) -> Hash (cost=1814840.12..1814840.12 rows=59986012 width=16) (actual time=20846.345..20846.345 rows=59986052 loops=4) Buckets: 33554432 Batches: 4 Memory Usage: 965240kB -> Seq Scan on lineitem l2 (cost=0.00..1814840.12 rows=59986012 width=16) (actual time=0.046..11007.730 rows=59986052 loops=4) -> Index Scan using orders_pkey on orders (cost=0.43..7.40 rows=1 width=4) (actual time=0.004..0.004 rows=0 loops=1441593) Index Cond: (o_orderkey = l1.l_orderkey) Filter: (o_orderstatus = 'F'::bpchar) Rows Removed by Filter: 1 -> Index Scan using idx_lineitem_orderkey on lineitem l3 (cost=0.56..21.58 rows=53 width=16) (actual time=0.005..0.005 rows=1 loops=696628) Index Cond: (l_orderkey = l1.l_orderkey) Filter: ((l_receiptdate > l_commitdate) AND (l_suppkey <> l1.l_suppkey)) Rows Removed by Filter: 1 Planning time: 4.970 ms Execution time: 39809.977 ms (47 rows)
QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=4216591.95..4216591.95 rows=1 width=34) (actual time=9489.945..9489.957 rows=100 loops=1) -> Sort (cost=4216591.95..4216591.95 rows=1 width=34) (actual time=9489.943..9489.949 rows=100 loops=1) Sort Key: (count(*)) DESC, supplier.s_name Sort Method: top-N heapsort Memory: 38kB -> GroupAggregate (cost=4216591.92..4216591.94 rows=1 width=34) (actual time=9480.290..9488.403 rows=3945 loops=1) Group Key: supplier.s_name -> Sort (cost=4216591.92..4216591.92 rows=1 width=26) (actual time=9480.278..9483.654 rows=38905 loops=1) Sort Key: supplier.s_name Sort Method: quicksort Memory: 4576kB -> Nested Loop Anti Join (cost=559157.24..4216591.91 rows=1 width=26) (actual time=4154.990..9198.379 rows=38905 loops=1) -> Gather (cost=559156.67..4216548.78 rows=1 width=42) (actual time=4153.557..7185.268 rows=696628 loops=1) Workers Planned: 4 Workers Launched: 4 -> Nested Loop Semi Join (cost=558156.67..4215548.68 rows=1 width=42) (actual time=4391.022..8656.693 rows=139326 loops=5) Join Filter: (l2.l_suppkey <> l1.l_suppkey) Rows Removed by Join Filter: 36246 -> Hash Join (cost=558156.11..1984566.52 rows=97950 width=46) (actual time=4390.978..7970.236 rows=144548 loops=5) Hash Cond: (l1.l_orderkey = orders.o_orderkey) -> Hash Join (cost=2585.58..1425767.03 rows=199953 width=42) (actual time=20.784..3405.518 rows=298981 loops=5) Hash Cond: (l1.l_suppkey = supplier.s_suppkey) -> Parallel Seq Scan on lineitem l1 (cost=0.00..1402436.29 rows=4998834 width=16) (actual time=0.040..2570.422 rows=7585870 loops=5) Filter: (l_receiptdate > l_commitdate) Rows Removed by Filter: 4411341 -> Hash (cost=2535.58..2535.58 rows=4000 width=30) (actual time=20.458..20.458 rows=3945 loops=5) Buckets: 4096 Batches: 1 Memory Usage: 279kB -> Nested Loop (cost=79.29..2535.58 rows=4000 width=30) (actual time=1.921..18.996 rows=3945 loops=5) -> Seq Scan on nation (cost=0.00..1.31 rows=1 width=4) (actual time=0.033..0.041 rows=1 loops=5) Filter: (n_name = 'ETHIOPIA'::bpchar) Rows Removed by Filter: 24 -> Bitmap Heap Scan on supplier (cost=79.29..2494.27 rows=4000 width=38) (actual time=1.881..17.968 rows=3945 loops=5) Recheck Cond: (s_nationkey = nation.n_nationkey) Heap Blocks: exact=1898 -> Bitmap Index Scan on idx_supplier_nation_key (cost=0.00..78.29 rows=4000 width=0) (actual time=1.363..1.363 rows=3945 loops=5) Index Cond: (s_nationkey = nation.n_nationkey) -> Hash (cost=463721.01..463721.01 rows=7347961 width=4) (actual time=4328.438..4328.438 rows=7309184 loops=5) Buckets: 8388608 Batches: 1 Memory Usage: 322500kB -> Seq Scan on orders (cost=0.00..463721.01 rows=7347961 width=4) (actual time=0.040..2856.374 rows=7309184 loops=5) Filter: (o_orderstatus = 'F'::bpchar) Rows Removed by Filter: 7690816 -> Index Scan using idx_lineitem_orderkey on lineitem l2 (cost=0.56..20.79 rows=159 width=16) (actual time=0.004..0.004 rows=1 loops=722740) Index Cond: (l_orderkey = orders.o_orderkey) -> Index Scan using idx_lineitem_orderkey on lineitem l3 (cost=0.56..21.58 rows=53 width=16) (actual time=0.003..0.003 rows=1 loops=696628) Index Cond: (l_orderkey = l1.l_orderkey) Filter: ((l_receiptdate > l_commitdate) AND (l_suppkey <> l1.l_suppkey)) Rows Removed by Filter: 1 Planning time: 4.495 ms Execution time: 9493.733 ms (47 rows)
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers