[HACKERS] The rewritting of join conditions caused a very slow query plan.

chang chao Mon, 16 May 2016 01:24:08 -0700

Hi,all

I have a query that is very slow,and the reason may be in the rewritting of 
join conditions.


this is the simplied version table and the key part of the sql.

level1_table and level2_table hold the tree data nodes,
and all_level_status table holds the current status all all nodes of all levels.
(I know that there would be much less trouble in performance if 
all_level_status was divided into two tables,namely,level1_status and 
level2_status tables.)

table1: level1_table
  level1_no   PK:serial
  level1_node_name :varchar

table2:level2_table
  level2_no   PK:serial
  parent_no   FK to level1_table.level1_no
  level2_node_name :varchar

table3: all_level_status
  level:1 OR 2 PK1
  node_no:level1_table.level1_no or level2_table.level2_no PK2
  status:0 OR 1(normal or abnormal)


The sql to find all level2 nodes whose parent level nodes are in normal status.

explain analyze
select * from level2_table l2
join (
 select l1.* from level1_table l1
 join all_level_status als on (als.level=1 and als.node_no=l1.level1_no)
 where  als.status=0
) normal_l1 on l2.parent_no=normal_l1.level1_no;


this is the query plan .

"Merge Join  (cost=3.38..5.13 rows=3 width=158) (actual time=0.087..0.179 
rows=21 loops=1)"
"  Merge Cond: (als.node_no = l2.parent_no)"
"  ->  Merge Join  (cost=1.63..7.66 rows=19 width=80) (actual time=0.067..0.126 
rows=18 loops=1)"
"        Merge Cond: (als.node_no = l1.level1_no)"
"        ->  Index Scan using all_level_status_pkey on all_level_status als  
(cost=0.00..21.74 rows=19 width=4) (actual time=0.037..0.079 rows=18 loops=1)"
"              Index Cond: (level = 1)"
"              Filter: (status = 0)"
"        ->  Sort  (cost=1.63..1.68 rows=20 width=76) (actual time=0.026..0.026 
rows=20 loops=1)"
"              Sort Key: l1.level1_no"
"              Sort Method:  quicksort  Memory: 27kB"
"              ->  Seq Scan on level1_table l1  (cost=0.00..1.20 rows=20 
width=76) (actual time=0.005..0.009 rows=20 loops=1)"
"  ->  Sort  (cost=1.75..1.81 rows=23 width=82) (actual time=0.016..0.024 
rows=23 loops=1)"
"        Sort Key: l2.parent_no"
"        Sort Method:  quicksort  Memory: 28kB"
"        ->  Seq Scan on level2_table l2  (cost=0.00..1.23 rows=23 width=82) 
(actual time=0.003..0.005 rows=23 loops=1)"
"Total runtime: 0.307 ms"


Please note that,join condition of query plan line 2 is rewritten to 
"als.node_no = l2.parent_no"
level1 and level2 nodes are of the 1:n relationship,and because 
all_level_status.node_no represents different things(level1_table.level1_no and 
level2_table.level2_no uses separate serials),so when this rewriting is 
applied,the statistics of mcvs of all_level_status.node_no and level2.parent_no 
will be used to do the row selectivity,as can be anticipated,a large gap 
occurred between actual rows and estimated rows.

the above sql is one simplified part of a long sql,because of this gap,the 
estimated row count becomes 1 in the outer sub-query,which in actual has large 
number of values,
the very slow nested-loop join is selected.

Had the rewriting of the join condition not be done,maybe a much fast query 
plan would be selected.

So I'm wondering what is the reason behind the join condition rewriting,
Is it just because that join conditions that both left and right side have mcvs 
are preferable to those in which there are no mcvs on both sides?


Chao.

[HACKERS] The rewritting of join conditions caused a very slow query plan.

Reply via email to