Tom Lane wrote:
Ron Mayer <[EMAIL PROTECTED]> writes:
Would another possible condition for considering
Cartesian joins be be:

   * Consider Cartesian joins when a unique constraint can prove
     that at most one row will be pulled from one of the tables
     that would be part of this join?

What for?  That would still lead us to consider large numbers of totally
useless joins.

                        regards, tom lane

Often I get order-of-magnitude better queries by forcing the cartesian
join even without multi-column indexes.

Explain analyze results below.



Here's an example with your typical star schema.
  fact is the central fact table.
  d_ref is a dimension table for the referrer
  d_uag is a dimension table for the useragent.

Forcing the cartesan join using "offset 0" makes
the the query take 14 ms (estimated cost 7575).

If I don't force the cartesian join the query takes
over 100ms (estimated cost 398919).

Indexes are on each dimension; but no multi-column
indexes (since the ad-hoc queries can hit any permutation
of dimensions).

logs=# explain analyze select * from fact natural join (select * from d_ref natural join d_uag where ref_host = 'www.real.com' 
and ref_path = '/products/player/more_info/moreinfo.html' and ref_query = 
'?ID=370&DC=&LANG=&PN=RealOne%20Player&PV=6.0.11.818&PT=&OS=&CM=&CMV=&LS=&RE=&RA=&RV='
 and useragent = 'Mozilla/4.08 [en] (WinNT; U ;Nav)' offset 0 ) as a;
                                                                                
                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=6465.12..7575.91 rows=367 width=2096) (actual 
time=14.152..14.192 rows=4 loops=1)
   ->  Limit  (cost=0.00..14.22 rows=1 width=218) (actual time=0.084..0.102 
rows=1 loops=1)
         ->  Nested Loop  (cost=0.00..14.22 rows=1 width=218) (actual 
time=0.082..0.096 rows=1 loops=1)
               ->  Index Scan using i_ref__val on d_ref  (cost=0.00..7.83 
rows=1 width=127) (actual time=0.056..0.058 rows=1 loops=1)
                     Index Cond: (((ref_path)::text = '/products/player/more_info/moreinfo.html'::text) AND ((ref_host)::text = 
'www.real.com'::text) AND ((ref_query)::text = 
'?ID=370&DC=&LANG=&PN=RealOne%20Player&PV=6.0.11.818&PT=&OS=&CM=&CMV=&LS=&RE=&RA=&RV='::text))
               ->  Index Scan using i_uag__val on d_uag  (cost=0.00..6.38 
rows=1 width=91) (actual time=0.020..0.029 rows=1 loops=1)
                     Index Cond: ((useragent)::text = 'Mozilla/4.08 [en] 
(WinNT; U ;Nav)'::text)
   ->  Bitmap Heap Scan on fact  (cost=6465.12..7556.18 rows=367 width=32) 
(actual time=14.053..14.066 rows=4 loops=1)
         Recheck Cond: ((fact.uag_id = a.uag_id) AND (fact.ref_id = a.ref_id))
         ->  BitmapAnd  (cost=6465.12..6465.12 rows=367 width=0) (actual 
time=14.016..14.016 rows=0 loops=1)
               ->  Bitmap Index Scan on i__fact__uag_id  (cost=0.00..2770.83 
rows=196223 width=0) (actual time=2.258..2.258 rows=7960 loops=1)
                     Index Cond: (fact.uag_id = a.uag_id)
               ->  Bitmap Index Scan on i__fact__ref_id  (cost=0.00..3581.50 
rows=253913 width=0) (actual time=9.960..9.960 rows=13751 loops=1)
                     Index Cond: (fact.ref_id = a.ref_id)
 Total runtime: 14.332 ms
(15 rows)

logs=#



logs=# explain analyze select * from fact natural join (select * from d_ref natural join d_uag where ref_host = 'www.real.com' 
and ref_path = '/products/player/more_info/moreinfo.html' and ref_query = 
'?ID=370&DC=&LANG=&PN=RealOne%20Player&PV=6.0.11.818&PT=&OS=&CM=&CMV=&LS=&RE=&RA=&RV='
 and useragent = 'Mozilla/4.08 [en] (WinNT; U ;Nav)' ) as a;
                                                                                
                                               QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=2827.72..398919.05 rows=1 width=242) (actual 
time=78.777..107.038 rows=4 loops=1)
   Hash Cond: (fact.ref_id = d_ref.ref_id)
   ->  Nested Loop  (cost=2819.88..398908.65 rows=511 width=119) (actual 
time=6.311..101.843 rows=7960 loops=1)
         ->  Index Scan using i_uag__val on d_uag  (cost=0.00..6.38 rows=1 
width=91) (actual time=0.021..0.029 rows=1 loops=1)
               Index Cond: ((useragent)::text = 'Mozilla/4.08 [en] (WinNT; U 
;Nav)'::text)
         ->  Bitmap Heap Scan on fact  (cost=2819.88..396449.49 rows=196223 
width=32) (actual time=6.273..91.645 rows=7960 loops=1)
               Recheck Cond: (fact.uag_id = d_uag.uag_id)
               ->  Bitmap Index Scan on i__fact__uag_id  (cost=0.00..2770.83 
rows=196223 width=0) (actual time=5.117..5.117 rows=7960 loops=1)
                     Index Cond: (fact.uag_id = d_uag.uag_id)
   ->  Hash  (cost=7.83..7.83 rows=1 width=127) (actual time=0.069..0.069 
rows=1 loops=1)
         ->  Index Scan using i_ref__val on d_ref  (cost=0.00..7.83 rows=1 
width=127) (actual time=0.059..0.062 rows=1 loops=1)
               Index Cond: (((ref_path)::text = '/products/player/more_info/moreinfo.html'::text) AND ((ref_host)::text = 
'www.real.com'::text) AND ((ref_query)::text = 
'?ID=370&DC=&LANG=&PN=RealOne%20Player&PV=6.0.11.818&PT=&OS=&CM=&CMV=&LS=&RE=&RA=&RV='::text))
 Total runtime: 107.193 ms
(13 rows)

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Reply via email to