Hi buddies,

I've got a query as below, it runs several times with different execution plan 
and totally different execution time. The one using hash-join is slow and the 
one using semi-hash join is very fast. However, I have no control over the 
optimizer behavior of PostgreSQL database. Or, do I have?

The database version is 9.3.4

SELECT dem_type,
       dem_value,
       Count(*)
FROM   demo_weekly a
WHERE  date = '2013-11-30'
AND    userid IN ( select userid from test1)
       AND dem_type IN ( 'Gender', 'Age', 'Hobbies' )
GROUP  BY dem_type,
          dem_value ;


                                                                                
    QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate  (cost=322386.94..322786.94 rows=40000 width=29) (actual 
time=3142.849..3142.927 rows=19 loops=1)
   ->  Hash Semi Join  (cost=14460.06..314403.08 rows=1064514 width=29) (actual 
time=803.671..2786.979 rows=1199961 loops=1)
         Hash Cond: ((a.userid)::text = (test1.userid)::text)
         ->  Append  (cost=0.00..277721.30 rows=2129027 width=78) (actual 
time=536.829..1691.270 rows=2102611 loops=1)
               ->  Seq Scan on demo_weekly a  (cost=0.00..0.00 rows=1 
width=808) (actual time=0.002..0.002 rows=0 loops=1)
                     Filter: ((date = '2013-11-30'::date) AND ((dem_type)::text 
= ANY ('{Gender,Age,"Hobbies"}'::text[])))
               ->  Bitmap Heap Scan on demo_weekly_20131130 a_1  
(cost=50045.63..277721.30 rows=2129026 width=78) (actual time=536.826..1552.203 
rows=2102611 loops=1)
                     Recheck Cond: ((dem_type)::text = ANY 
('{Gender,Age,"Hobbies"}'::text[]))
                     Filter: (date = '2013-11-30'::date)
                     ->  Bitmap Index Scan on demo_weekly_20131130_dt_idx  
(cost=0.00..49513.37 rows=2129026 width=0) (actual time=467.453..467.453 
rows=2102611 loops=1)
                           Index Cond: ((dem_type)::text = ANY 
('{Gender,Age,"Hobbies"}'::text[]))
         ->  Hash  (cost=8938.36..8938.36 rows=441736 width=50) (actual 
time=266.501..266.501 rows=441736 loops=1)
               Buckets: 65536  Batches: 1  Memory Usage: 35541kB
               ->  Seq Scan on test1  (cost=0.00..8938.36 rows=441736 width=50) 
(actual time=0.023..87.869 rows=441736 loops=1)
Total runtime: 3149.004 ms
(15 rows)

                                                                                
   QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate  (cost=318351.90..318751.90 rows=40000 width=29) (actual 
time=23668.646..23668.723 rows=19 loops=1)
   ->  Hash Join  (cost=5316.68..310497.81 rows=1047212 width=29) (actual 
time=1059.182..23218.864 rows=1199961 loops=1)
         Hash Cond: ((a.userid)::text = (test1.userid)::text)
         ->  Append  (cost=0.00..276382.82 rows=2094423 width=78) (actual 
time=528.116..2002.462 rows=2102611 loops=1)
               ->  Seq Scan on demo_weekly a  (cost=0.00..0.00 rows=1 
width=808) (actual time=0.001..0.001 rows=0 loops=1)
                     Filter: ((date = '2013-11-30'::date) AND ((dem_type)::text 
= ANY ('{Gender,Age,"Hobbies"}'::text[])))
               ->  Bitmap Heap Scan on demo_weekly_20131130 a_1  
(cost=49269.46..276382.82 rows=2094422 width=78) (actual time=528.114..1825.265 
rows=2102611 loops=1)
                     Recheck Cond: ((dem_type)::text = ANY 
('{Gender,Age,"Hobbies"}'::text[]))
                     Filter: (date = '2013-11-30'::date)
                     ->  Bitmap Index Scan on demo_weekly_20131130_dt_idx  
(cost=0.00..48745.85 rows=2094422 width=0) (actual time=458.694..458.694 
rows=2102611 loops=1)
                           Index Cond: ((dem_type)::text = ANY 
('{Gender,Age,"Hobbies"}'::text[]))
         ->  Hash  (cost=5314.18..5314.18 rows=200 width=516) (actual 
time=530.930..530.930 rows=441736 loops=1)
               Buckets: 1024  Batches: 1  Memory Usage: 35541kB
               ->  HashAggregate  (cost=5312.18..5314.18 rows=200 width=516) 
(actual time=298.301..411.734 rows=441736 loops=1)
                     ->  Seq Scan on test1  (cost=0.00..5153.94 rows=63294 
width=516) (actual time=0.068..91.378 rows=441736 loops=1)
Total runtime: 23679.096 ms
(16 rows)

Reply via email to