Hello,Postgres can produce a plan with a nested loop node having rows estimate much more than the product of underlying nodes' estimates, relying only on outer relation size:
alexey=# explain
SELECT oid, relname
FROM (
SELECT m.oid, m.relname
FROM pg_class m
UNION ALL
SELECT m.oid, m.relname
FROM pg_class m
) m
WHERE oid IN (VALUES (162456317), (162456310));
QUERY PLAN
----------------------------------------------------------------------------------------------------
Nested Loop (cost=0.31..33.24 rows=*341* width=68)
-> Unique (cost=0.04..0.04 rows=*2* width=4)
-> Sort (cost=0.04..0.04 rows=2 width=4)
Sort Key: (("*VALUES*".column1)::oid)
-> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2
width=4)
-> Append (cost=0.27..16.58 rows=*2* width=68)-> Index Scan using pg_class_oid_index on pg_class m (cost=0.27..8.29 rows=1 width=68)
Index Cond: (oid = ("*VALUES*".column1)::oid)
-> Index Scan using pg_class_oid_index on pg_class m_1
(cost=0.27..8.29 rows=1 width=68)
Index Cond: (oid = ("*VALUES*".column1)::oid)
(10 rows)
Why?
Is there a reason that join cardinality estimates are not limited by the
product of the joined parts cardinalities like in the
join-card-est.patch attached?
An example of a query working faster as a result of this change is in
join-card-est.sql, result is in join-card-est.result
Best Regards, Alexey
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index b35acb7..fa4bebc 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -2185,15 +2185,6 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
else
path->path.rows = path->path.parent->rows;
- /* For partial paths, scale row estimate. */
- if (path->path.parallel_workers > 0)
- {
- double parallel_divisor = get_parallel_divisor(&path->path);
-
- path->path.rows =
- clamp_row_est(path->path.rows / parallel_divisor);
- }
-
/*
* We could include disable_cost in the preliminary estimate, but that
* would amount to optimizing for the case where the join method is
@@ -2321,6 +2312,19 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
ntuples = outer_path_rows * inner_path_rows;
}
+ /* We cannot emit more rows than we process. */
+ if (path->path.rows > ntuples)
+ path->path.rows = clamp_row_est(ntuples);
+
+ /* For partial paths, scale row estimate. */
+ if (path->path.parallel_workers > 0)
+ {
+ double parallel_divisor = get_parallel_divisor(&path->path);
+
+ path->path.rows =
+ clamp_row_est(path->path.rows / parallel_divisor);
+ }
+
/* CPU costs */
cost_qual_eval(&restrict_qual_cost, path->joinrestrictinfo, root);
startup_cost += restrict_qual_cost.startup;
join-card-est.sql
Description: application/sql
===========MASTER:
Hash Join (cost=3085.45..23948.76 rows=1000000 width=12) (actual
time=36.048..36.432 rows=4 loops=1)
Hash Cond: (aa1.b = bb.b)
-> Nested Loop (cost=1.46..34.85 rows=1000000 width=8) (actual
time=0.048..0.091 rows=4 loops=1)
-> Unique (cost=1.03..1.04 rows=2 width=4) (actual time=0.015..0.018
rows=2 loops=1)
-> Sort (cost=1.03..1.03 rows=2 width=4) (actual
time=0.015..0.016 rows=2 loops=1)
Sort Key: ai.a
Sort Method: quicksort Memory: 25kB
-> Seq Scan on ai (cost=0.00..1.02 rows=2 width=4)
(actual time=0.008..0.009 rows=2 loops=1)
-> Append (cost=0.42..16.89 rows=2 width=8) (actual
time=0.019..0.033 rows=2 loops=2)
-> Index Scan using aa1_pkey on aa1 (cost=0.42..8.44 rows=1
width=8) (actual time=0.018..0.020 rows=1 loops=2)
Index Cond: (a = ai.a)
-> Index Scan using aa2_pkey on aa2 (cost=0.42..8.44 rows=1
width=8) (actual time=0.011..0.011 rows=1 loops=2)
Index Cond: (a = ai.a)
-> Hash (cost=1443.00..1443.00 rows=100000 width=8) (actual
time=35.871..35.871 rows=100000 loops=1)
Buckets: 131072 Batches: 2 Memory Usage: 2976kB
-> Seq Scan on bb (cost=0.00..1443.00 rows=100000 width=8) (actual
time=0.008..15.230 rows=100000 loops=1)
Planning time: 0.334 ms
Execution time: 37.217 ms
===========PATCHED:
Nested Loop (cost=1.75..36.10 rows=1 width=12) (actual time=0.021..0.040
rows=4 loops=1)
-> Nested Loop (cost=1.46..34.85 rows=4 width=8) (actual time=0.017..0.029
rows=4 loops=1)
-> Unique (cost=1.03..1.04 rows=2 width=4) (actual time=0.008..0.009
rows=2 loops=1)
-> Sort (cost=1.03..1.03 rows=2 width=4) (actual
time=0.007..0.008 rows=2 loops=1)
Sort Key: ai.a
Sort Method: quicksort Memory: 25kB
-> Seq Scan on ai (cost=0.00..1.02 rows=2 width=4)
(actual time=0.003..0.004 rows=2 loops=1)
-> Append (cost=0.42..16.89 rows=2 width=8) (actual
time=0.005..0.009 rows=2 loops=2)
-> Index Scan using aa1_pkey on aa1 (cost=0.42..8.44 rows=1
width=8) (actual time=0.005..0.005 rows=1 loops=2)
Index Cond: (a = ai.a)
-> Index Scan using aa2_pkey on aa2 (cost=0.42..8.44 rows=1
width=8) (actual time=0.003..0.003 rows=1 loops=2)
Index Cond: (a = ai.a)
-> Index Scan using bb_pkey on bb (cost=0.29..0.31 rows=1 width=8) (actual
time=0.002..0.002 rows=1 loops=4)
Index Cond: (b = aa1.b)
Planning time: 0.222 ms
Execution time: 0.076 ms
-- Sent via pgsql-hackers mailing list ([email protected]) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
