Hi all,

I've been running into an issue with star-schema queries for a while: PostgreSQL's 
performance on them is poor unless the query is rewritten (or at least I 
couldn't find another way to make the planner do what I want).

Here is a very basic mockup schema, used for the rest of this mail. It is of 
course too small to show really long runtimes (with the schema that triggered 
this work, I see minutes, not tenths of a second):

create table dim1 (a int, b int);
create table dim2 (a int, b int);
insert into dim1 select i, (random()*1000)::int from generate_series(1,10000) g(i);
insert into dim2 select i, (random()*1000)::int from generate_series(1,10000) g(i);
create index idxdim11 on dim1(b);
create index idxdim12 on dim1(a);
create index idxdim21 on dim2(b);
create index idxdim22 on dim2(a);

create table facts (dim1 bigint, dim2 bigint, data1 text, data2 text,
                    data3 text, data4 text, data5 text, data6 text);
insert into facts select (random()*1000)::int, (random()*1000)::int, i, i, i, i, i
  from generate_series(1,10000000) g(i);
create index idxfacts on facts(dim1,dim2);
analyze;


Here is the query that I want to make faster:

SELECT * FROM dim1, dim2, facts WHERE facts.dim1=dim1.a AND facts.dim2=dim2.a AND dim1.b=12 AND dim2.b=17;

PostgreSQL creates this plan:

 Nested Loop  (cost=233.98..207630.07 rows=8 width=99) (actual time=131.891..131.891 rows=0 loops=1)
   Join Filter: (facts.dim2 = dim2.a)
   Rows Removed by Join Filter: 239796
   ->  Index Scan using idxdim22 on dim2  (cost=0.29..343.29 rows=9 width=8) (actual time=0.672..4.436 rows=12 loops=1)
         Filter: (b = 17)
         Rows Removed by Filter: 9988
   ->  Materialize  (cost=233.70..206094.28 rows=9000 width=91) (actual time=0.471..6.673 rows=19983 loops=12)
         ->  Nested Loop  (cost=233.70..206049.28 rows=9000 width=91) (actual time=5.633..53.121 rows=19983 loops=1)
               ->  Index Scan using idxdim12 on dim1  (cost=0.29..343.29 rows=9 width=8) (actual time=0.039..4.359 rows=10 loops=1)
                     Filter: (b = 12)
                     Rows Removed by Filter: 9990
               ->  Bitmap Heap Scan on facts  (cost=233.41..22756.32 rows=9990 width=83) (actual time=1.113..4.146 rows=1998 loops=10)
                     Recheck Cond: (dim1 = dim1.a)
                     Heap Blocks: exact=19085
                     ->  Bitmap Index Scan on idxfacts  (cost=0.00..230.92 rows=9990 width=0) (actual time=0.602..0.602 rows=1998 loops=10)
                           Index Cond: (dim1 = dim1.a)
 Planning time: 1.896 ms
 Execution time: 132.588 ms


What I used to do was set join_collapse_limit to 1 and rewrite the query 
like this:

EXPLAIN ANALYZE SELECT * FROM dim1 CROSS JOIN dim2 JOIN facts ON (facts.dim1=dim1.a AND facts.dim2=dim2.a) WHERE dim1.b=12 AND dim2.b=17;
                                                             QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=13.25..3661.66 rows=8 width=99) (actual time=0.758..0.758 rows=0 loops=1)
   ->  Nested Loop  (cost=8.71..57.82 rows=81 width=16) (actual time=0.065..0.151 rows=120 loops=1)
         ->  Bitmap Heap Scan on dim1  (cost=4.35..28.39 rows=9 width=8) (actual time=0.040..0.057 rows=10 loops=1)
               Recheck Cond: (b = 12)
               Heap Blocks: exact=10
               ->  Bitmap Index Scan on idxdim11  (cost=0.00..4.35 rows=9 width=0) (actual time=0.033..0.033 rows=10 loops=1)
                     Index Cond: (b = 12)
         ->  Materialize  (cost=4.35..28.44 rows=9 width=8) (actual time=0.002..0.006 rows=12 loops=10)
               ->  Bitmap Heap Scan on dim2  (cost=4.35..28.39 rows=9 width=8) (actual time=0.021..0.040 rows=12 loops=1)
                     Recheck Cond: (b = 17)
                     Heap Blocks: exact=11
                     ->  Bitmap Index Scan on idxdim21  (cost=0.00..4.35 rows=9 width=0) (actual time=0.017..0.017 rows=12 loops=1)
                           Index Cond: (b = 17)
   ->  Bitmap Heap Scan on facts  (cost=4.54..44.39 rows=10 width=83) (actual time=0.004..0.004 rows=0 loops=120)
         Recheck Cond: ((dim1 = dim1.a) AND (dim2 = dim2.a))
         ->  Bitmap Index Scan on idxfacts  (cost=0.00..4.53 rows=10 width=0) (actual time=0.004..0.004 rows=0 loops=120)
               Index Cond: ((dim1 = dim1.a) AND (dim2 = dim2.a))
 Planning time: 0.520 ms
 Execution time: 0.812 ms


The estimated cost is more than 50 times lower (3661 vs. 207630), and the 
actual runtime drops from about 130 ms to under 1 ms. So this plan seems like a 
good candidate; at least it keeps my users happy.
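For completeness, here is roughly what the workaround session looks like. 
join_collapse_limit is a real planner GUC; the rest is just the rewritten query 
from above with the setting applied around it:

```sql
-- Stop the planner from reordering explicit JOIN syntax, so the
-- written join order (dimensions first, then facts) is the one planned.
SET join_collapse_limit = 1;

SELECT *
FROM dim1
CROSS JOIN dim2
JOIN facts ON (facts.dim1 = dim1.a AND facts.dim2 = dim2.a)
WHERE dim1.b = 12 AND dim2.b = 17;

RESET join_collapse_limit;
```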



That rewriting worked for me, but today I'm in a context where I cannot rewrite 
the query: it's generated.


So I took a look at the optimizer's code to try to understand why I get this 
plan. If I understand correctly, the optimizer won't consider clauseless (cross) 
joins unless it has no other choice.


A funny side note before continuing: filtering both dimensions on the same 
value, so that dim1.b = dim2.b is implied, gives the right plan too:

EXPLAIN ANALYZE SELECT * FROM dim1, dim2, facts WHERE facts.dim1=dim1.a AND facts.dim2=dim2.a AND dim1.b=12 AND dim2.b=12;
                                                             QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=9.43..954.86 rows=1 width=184) (actual time=1.409..1.409 rows=0 loops=1)
   ->  Nested Loop  (cost=8.74..82.11 rows=100 width=32) (actual time=0.155..0.370 rows=70 loops=1)
         ->  Bitmap Heap Scan on dim1  (cost=4.37..40.42 rows=10 width=16) (actual time=0.093..0.144 rows=7 loops=1)
               Recheck Cond: (b = 12)
               Heap Blocks: exact=7
               ->  Bitmap Index Scan on idxdim11  (cost=0.00..4.37 rows=10 width=0) (actual time=0.074..0.074 rows=7 loops=1)
                     Index Cond: (b = 12)
         ->  Materialize  (cost=4.37..40.47 rows=10 width=16) (actual time=0.009..0.020 rows=10 loops=7)
               ->  Bitmap Heap Scan on dim2  (cost=4.37..40.42 rows=10 width=16) (actual time=0.051..0.096 rows=10 loops=1)
                     Recheck Cond: (b = 12)
                     Heap Blocks: exact=10
                     ->  Bitmap Index Scan on idxdim21  (cost=0.00..4.37 rows=10 width=0) (actual time=0.038..0.038 rows=10 loops=1)
                           Index Cond: (b = 12)
   ->  Index Scan using idxfacts on facts  (cost=0.69..8.72 rows=1 width=152) (actual time=0.013..0.013 rows=0 loops=70)
         Index Cond: ((dim1 = dim1.a) AND (dim2 = dim2.a))
 Planning time: 2.616 ms
 Execution time: 1.625 ms
(17 rows)


That seems logical: dim1.b = 12 and dim2.b = 12 put dim1.b and dim2.b into the 
same EquivalenceClass, so the planner sees a usable join clause between the two 
dimensions, considers joining them, and chooses this plan.
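Written with the cross-dimension equality explicit, the same b=12/b=12 query 
would look like this (an equivalent, untested formulation, shown only to make 
the implied join clause visible):

```sql
-- dim1.b = 12 together with dim1.b = dim2.b implies dim2.b = 12,
-- and hands the planner an explicit join clause between the dimensions.
SELECT *
FROM dim1, dim2, facts
WHERE facts.dim1 = dim1.a
  AND facts.dim2 = dim2.a
  AND dim1.b = 12
  AND dim1.b = dim2.b;
```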




Simply commenting out those tests in the planner (joinrels.c) solves my 
problem, though of course it makes the number of plans considered much larger. I 
attached the patch to this mail (as it is very small), not as a patch proposal 
of course, just to show what I have tested.

EXPLAIN ANALYZE SELECT * FROM dim1, dim2, facts WHERE facts.dim1=dim1.a AND facts.dim2=dim2.a AND dim1.b=12 AND dim2.b=17;
                                                            QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=13.32..7678.71 rows=17 width=99) (actual time=0.168..3.485 rows=20 loops=1)
   ->  Nested Loop  (cost=8.79..70.60 rows=171 width=16) (actual time=0.107..0.452 rows=152 loops=1)
         ->  Bitmap Heap Scan on dim2  (cost=4.43..40.05 rows=19 width=8) (actual time=0.064..0.153 rows=19 loops=1)
               Recheck Cond: (b = 17)
               Heap Blocks: exact=18
               ->  Bitmap Index Scan on idxdim21  (cost=0.00..4.43 rows=19 width=0) (actual time=0.042..0.042 rows=19 loops=1)
                     Index Cond: (b = 17)
         ->  Materialize  (cost=4.35..28.44 rows=9 width=8) (actual time=0.002..0.008 rows=8 loops=19)
               ->  Bitmap Heap Scan on dim1  (cost=4.35..28.39 rows=9 width=8) (actual time=0.034..0.062 rows=8 loops=1)
                     Recheck Cond: (b = 12)
                     Heap Blocks: exact=6
                     ->  Bitmap Index Scan on idxdim11  (cost=0.00..4.35 rows=9 width=0) (actual time=0.023..0.023 rows=8 loops=1)
                           Index Cond: (b = 12)
   ->  Bitmap Heap Scan on facts  (cost=4.54..44.39 rows=10 width=83) (actual time=0.016..0.017 rows=0 loops=152)
         Recheck Cond: ((dim1 = dim1.a) AND (dim2 = dim2.a))
         Heap Blocks: exact=20
         ->  Bitmap Index Scan on idxfacts  (cost=0.00..4.53 rows=10 width=0) (actual time=0.014..0.014 rows=0 loops=152)
               Index Cond: ((dim1 = dim1.a) AND (dim2 = dim2.a))
 Planning time: 2.434 ms
 Execution time: 3.666 ms


So now I get the plan I want.

I'm not advocating applying this patch, of course: it would increase planning 
time for every other use of PostgreSQL, which is the obvious reason why the 
planner doesn't examine all those cross joins in the first place.

So my questions are:

* Is this patch "correct", meaning that, if I had no other solution, I could 
run a patched version for this specific use case? It passes all regression 
tests and seems to return only valid plans.
* Could this be implemented properly, maybe behind a GUC similar to 
'constraint_exclusion'? Some GUC saying "this user or database is used for 
data warehousing, so examine more plans, it's worth it"? Or is there something 
smarter that could be done?
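To illustrate the second question, the knob might be used like this. The GUC 
name examine_clauseless_joins and the database name dwh are purely hypothetical, 
chosen by analogy with constraint_exclusion; nothing like this exists today:

```sql
-- Hypothetical GUC, shown only to sketch the proposal.
ALTER DATABASE dwh SET examine_clauseless_joins = on;  -- per database
SET examine_clauseless_joins = on;                     -- or per session
```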

The real-life difference between the two plans is a query runtime going from 4 
minutes to under a second.

Of course, I'm willing to work on this, if what I wrote in the rest of this 
email isn't completely wrong.

Regards,

Marc
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index e7e9a1a..8ec1af8 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -96,6 +96,10 @@ join_search_one_level(PlannerInfo *root, int level)
 			make_rels_by_clause_joins(root,
 									  old_rel,
 									  other_rels);
+			/* If we are in a star schema we have to try everything */
+			make_rels_by_clauseless_joins(root,
+										  old_rel,
+										  other_rels);
 		}
 		else
 		{
@@ -147,10 +151,11 @@ join_search_one_level(PlannerInfo *root, int level)
 			 * participate in join-order restrictions --- then we might have
 			 * to force a bushy join plan.
 			 */
+/*
 			if (old_rel->joininfo == NIL && !old_rel->has_eclass_joins &&
 				!has_join_restriction(root, old_rel))
 				continue;
-
+*/
 			if (k == other_level)
 				other_rels = lnext(r);	/* only consider remaining rels */
 			else
@@ -167,11 +172,15 @@ join_search_one_level(PlannerInfo *root, int level)
 					 * pair of rels.  Do so if there is at least one relevant
 					 * join clause or join order restriction.
 					 */
+/*
 					if (have_relevant_joinclause(root, old_rel, new_rel) ||
 						have_join_order_restriction(root, old_rel, new_rel))
 					{
+*/
 						(void) make_join_rel(root, old_rel, new_rel);
+/*
 					}
+*/
 				}
 			}
 		}
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)