Hello, I had a report that a query gets wired estimated row numbers and it makes subsequent plans wrong.
Although the detailed explanation is shown later in this mail, the problem here was that a query like following makes the path with apparently wrong row number. EXPLAIN SELECT a FROM (SELECT a FROM t1 UNION ALL SELECT 0) t JOIN (VALUES (1)) tt(x) ON tt.x = t.a; 0> Nested Loop (cost=0.43..8.51 rows=68732 width=4) 1> -> Values Scan on "*VALUES*" (cost=0.00..0.01 rows=1 width=4) 2> -> Append (cost=0.43..8.48 rows=2 width=4) 3> -> Index Only Scan using i_t1_a on t1 4> (cost=0.43..8.46 rows=1 width=4) 5> Index Cond: (a = "*VALUES*".column1) 6> -> Subquery Scan on "*SELECT* 2" (cost=0.00..0.02 rows=1 width=4) 7> Filter: ("*VALUES*".column1 = "*SELECT* 2"."?column?") 8> -> Result (cost=0.00..0.01 rows=1 width=0) The Nested Loop at line 0 says that it emits 68832 rows even though the underlying nodes, Values at line 1 and Append at line 2 are giving 1 and 2 respectively as their reults. This query actually returns only 1 row. It is seemingly wrong and causes the plans above go wrong. This is caused by the Append node, which don't have statistics, so eqjoinsel takes 0.5 as the default selectivity for the nested loop. Even though it is defficult to get statistics for Append node, the nested loop could get more sane row nubmer if it doesn't ignore the parameterized path selected for the scan on t1. I suppose that join nodes can safely clamp the number of its result rows by the product of the row numbers of underlying two paths. And if it is correct, the attached patch can correct the estimation like this. > Nested Loop (cost=0.43..16.51 rows=2 width=4) > -> Values Scan on "*VALUES*" (cost=0.00..0.01 rows=1 width=4) > -> Append (cost=0.43..16.48 rows=2 width=4) > -> Index Only Scan using i_t1_a on t1 > (cost=0.43..16.45 rows=1 width=4) > Index Cond: (a = "*VALUES*".column1) > -> Subquery Scan on "*SELECT* 2" (cost=0.00..0.02 rows=1 width=4) > Filter: ("*VALUES*".column1 = "*SELECT* 2"."?column?") > -> Result (cost=0.00..0.01 rows=1 width=0) Is it correct? And Does it have no bad side effects? And is this applicable for 9.5? The detailed test environment is described below. Before this fix, the inner path of the top-level join is doing hash because the inner path says that it will give 68732 rows. (But only 1 actually). After this fix, the outer path declaring that it gives 2 rows so the top-level join becomes nested loop and the inner path becomes index scan. ======= CREATE TABLE t1 (a int); INSERT INTO t1 (SELECT (a%2)*500000 + a / 2 FROM generate_series(0, 1000000) a); CREATE INDEX i_t1_a ON t1 (a); CREATE TABLE t2 AS SELECT * from t1; ANALYZE t1; ANALYZE t2; -- emulating the problematic situation SET join_collapse_limit TO 1; SET random_page_cost TO 8.0; SET seq_page_cost TO 0.5; EXPLAIN SELECT tt2.a FROM (SELECT a FROM (SELECT a FROM t1 UNION ALL SELECT 0) t JOIN (VALUES (1)) tt(x) ON tt.x = t.a) tt2 JOIN t2 ON (tt2.a = t2.a); ======== The plan before fix Hash Join (cost=30832.46..36230.60 rows=68732 width=4) (actual time=1258.880..1260.519 rows=1 loops=1) Hash Cond: ("*VALUES*".column1 = t2.a) -> Nested Loop (cost=0.43..8.51 rows=68732 width=8) (actual time=0.071..0.104 rows=1 loops=1) -> Values Scan on "*VALUES*" (cost=0.00..0.01 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=1) -> Append (cost=0.43..8.48 rows=2 width=4) (actual time=0.063..0.093 rows=1 loops=1) -> Index Only Scan using i_t1_a on t1 (cost=0.43..8.46 rows=1 width=4) (actual time=0.059..0.075 rows=1 loops=1) Index Cond: (a = "*VALUES*".column1) Heap Fetches: 2 -> Subquery Scan on "*SELECT* 2" (cost=0.00..0.02 rows=1 width=4) (actual time=0.010..0.010 rows=0 loops=1) Filter: ("*VALUES*".column1 = "*SELECT* 2"."?column?") Rows Removed by Filter: 1 -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.002 rows=1 loops=1) -> Hash (cost=14425.01..14425.01 rows=1000001 width=4) (actual time=1250.274..1250.274 rows=1000001 loops=1) Buckets: 131072 Batches: 16 Memory Usage: 3227kB -> Seq Scan on t2 (cost=0.00..14425.01 rows=1000001 width=4) (actual time=0.021..608.245 rows=1000001 loops=1) Planning time: 0.319 ms Execution time: 1260.792 ms ======== The plan after fix Nested Loop (cost=0.86..17.43 rows=2 width=4) (actual time=0.103..0.118 rows=1 loops=1) Join Filter: ("*VALUES*".column1 = t2.a) -> Nested Loop (cost=0.43..16.51 rows=2 width=8) (actual time=0.062..0.073 rows=1 loops=1) -> Values Scan on "*VALUES*" (cost=0.00..0.01 rows=1 width=4) (actual time=0.003..0.004 rows=1 loops=1) -> Append (cost=0.43..16.48 rows=2 width=4) (actual time=0.051..0.060 rows=1 loops=1) -> Index Only Scan using i_t1_a on t1 (cost=0.43..16.45 rows=1 width=4) (actual time=0.045..0.046 rows=1 loops=1) Index Cond: (a = "*VALUES*".column1) Heap Fetches: 1 -> Subquery Scan on "*SELECT* 2" (cost=0.00..0.02 rows=1 width=4) (actual time=0.005..0.005 rows=0 loops=1) Filter: ("*VALUES*".column1 = "*SELECT* 2"."?column?") Rows Removed by Filter: 1 -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.002..0.002 rows=1 loops=1) -> Index Only Scan using i_t2_a on t2 (cost=0.42..0.45 rows=1 width=4) (actual time=0.026..0.027 rows=1 loops=1) Index Cond: (a = t1.a) Heap Fetches: 1 Planning time: 0.968 ms Execution time: 0.239 ms ========= regards, -- Kyotaro Horiguchi NTT Open Source Software Center
>From 9c56e26130188d24a42469e18f3b6b7d84dbbc54 Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horiguchi.kyot...@lab.ntt.co.jp> Date: Fri, 6 Mar 2015 16:17:12 +0900 Subject: [PATCH] Clamp row number of join product by the row number calculated from joining paths. When joins contains Appends containing paths selected parameterized paths, the estimated rows for the join becomes far larger than the product of the two row numbers of the joining paths. We can safely clamp the join's row number by the product. --- src/backend/optimizer/path/costsize.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c index 78ef229..9a6643e 100644 --- a/src/backend/optimizer/path/costsize.c +++ b/src/backend/optimizer/path/costsize.c @@ -1855,6 +1855,14 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path, ntuples = outer_path_rows * inner_path_rows; } + /* + * In some cases the path has the row number of cartesian product without + * any restriction on the joining tables. We can safely clamp the number + * by ntuples. + */ + if (path->path.rows > ntuples) + path->path.rows = ntuples; + /* CPU costs */ cost_qual_eval(&restrict_qual_cost, path->joinrestrictinfo, root); startup_cost += restrict_qual_cost.startup; @@ -2293,6 +2301,14 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path, else run_cost += bare_inner_cost; + /* + * In some cases the path has the row number of cartesian product without + * any restriction on the joining tables. We can safely clamp the number + * by mergejointuples. See approx_tuple_count about mergejointuples. + */ + if (path->jpath.path.rows > mergejointuples) + path->jpath.path.rows = mergejointuples; + /* CPU costs */ /* @@ -2691,6 +2707,14 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path, } /* + * In some cases the path has the row number of cartesian product without + * any restriction on the joining tables. We can safely clamp the number + * by hashjointuples. See approx_tuple_count about hashjointuples. + */ + if (path->jpath.path.rows > hashjointuples) + path->jpath.path.rows = hashjointuples; + + /* * For each tuple that gets through the hashjoin proper, we charge * cpu_tuple_cost plus the cost of evaluating additional restriction * clauses that are to be applied at the join. (This is pessimistic since -- 2.1.0.GIT
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers