[PERFORM] avoiding seqscan?

Palle Girgensohn Sun, 28 Sep 2003 13:57:14 -0700

Hi!

I have a SQL statement that I cannot get to use the index. postgresql insists on using a seqscan and performance is very poor. set enable_seqscan = true boost performance drastically, as you can see below. Since seqscan is not always bad, I'd rather not turn it off completely, but rather get the planner to do the right thing here. Is there another way to do this, apart from setting enable_seqscan=false?

Thanks,
Palle


the tables are:
person  with about 30000 tuples
dyn_field_content_person, 331156 tuples
dyn_field_person, just 15 tuples
course about 700 tuples
partitcipant with ~ 7800 tuples, where ~ 60 have course_id=707...

uu=# explain analyze uu-# select lower(substr(p.last_name,1,1)) as letter, count(*) uu-# FROM course c join group_data gd on (c.active_group_id = gd.this_group_id) uu-# join person p on (gd.item_text = p.userid) uu-# join dyn_field_person dfp on (dfp.extern_item_id = 10 and dfp.giver=c.giver) uu-# join dyn_field_content_person dfcp on (dfp.id = dfcp.dyn_field_id and dfcp.userid=p.userid) uu-# left outer join participant pt on (pt.userid = p.userid and pt.course_id = 707) uu-# WHERE c.id = 707 uu-# group by 1 uu-# ;

QUERY PLAN

--------------------------------------------------------------------------- --------------------------------------------------------------------------- --------------------------------------- ------------------- Aggregate (cost=10496.30..10498.35 rows=27 width=106) (actual time=4166.01..4167.23 rows=19 loops=1) -> Group (cost=10496.30..10497.67 rows=273 width=106) (actual time=4165.92..4166.80 rows=60 loops=1) -> Sort (cost=10496.30..10496.98 rows=273 width=106) (actual time=4165.91..4166.10 rows=60 loops=1) Sort Key: lower(substr(p.last_name, 1, 1)) -> Merge Join (cost=10443.75..10485.23 rows=273 width=106) (actual time=4094.42..4165.20 rows=60 loops=1) Merge Cond: ("outer".userid = "inner".userid) Join Filter: ("inner".course_id = 707) -> Sort (cost=9803.86..9804.54 rows=273 width=88) (actual time=3823.78..3823.97 rows=60 loops=1) Sort Key: dfcp.userid -> Hash Join (cost=2444.22..9792.79 rows=273 width=88) (actual time=1140.50..3822.60 rows=60 loops=1) Hash Cond: ("outer".userid = "inner".item_text) Join Filter: ("inner".id = "outer".dyn_field_id) -> Seq Scan on dyn_field_content_person dfcp (cost=0.00..5643.56 rows=331156 width=16) (actual time=0.01..2028.31 rows=331156 loops=1) -> Hash (cost=2443.54..2443.54 rows=272 width=72) (actual time=340.24..340.24 rows=0 loops=1) -> Nested Loop (cost=1401.84..2443.54 rows=272 width=72) (actual time=338.76..339.91 rows=60 loops=1) Join Filter: ("outer".giver = "inner".giver) -> Seq Scan on dyn_field_person dfp (cost=0.00..1.19 rows=1 width=16) (actual time=0.06..0.09 rows=1 loops=1) Filter: (extern_item_id = 10) -> Materialize (cost=2437.67..2437.67 rows=374 width=56) (actual time=338.64..338.82 rows=60 loops=1) -> Hash Join (cost=1401.84..2437.67 rows=374 width=56) (actual time=7.74..338.36 rows=60 loops=1) Hash Cond: ("outer".userid = "inner".item_text) -> Seq Scan on person p (cost=0.00..806.09 rows=30009 width=23) (actual time=0.01..203.67 rows=30009 loops=1) -> Hash (cost=1400.89..1400.89 rows=378 width=33) (actual time=1.60..1.60 rows=0 loops=1) -> Nested Loop (cost=0.00..1400.89 rows=378 width=33) (actual time=0.12..1.28 rows=60 loops=1) -> Index Scan using course_pkey on course c (cost=0.00..5.08 rows=1 width=16) (actual time=0.06..0.06 rows=1 loops=1)

Index Cond: (id = 707) -> Index Scan using group_data_this_idx on group_data gd (cost=0.00..1390.80 rows=402 width=17) (actual time=0.04..0.6 6 rows=60 loops=1)

Index Cond: ("outer".active_group_id = gd.this_group_id) -> Sort (cost=639.90..659.42 rows=7808 width=18) (actual time=266.55..290.81 rows=7722 loops=1) Sort Key: pt.userid -> Seq Scan on participant pt (cost=0.00..135.08 rows=7808 width=18) (actual time=0.02..50.24 rows=7808 loops=1) Total runtime: 4170.16 msec (32 rader)

Tid: 4184,68 ms uu=# set enable_seqscan = false; SET Tid: 1,20 ms uu=# explain analyze uu-# select lower(substr(p.last_name,1,1)) as letter, count(*) uu-# FROM course c join group_data gd on (c.active_group_id = gd.this_group_id) uu-# join person p on (gd.item_text = p.userid) uu-# join dyn_field_person dfp on (dfp.extern_item_id = 10 and dfp.giver=c.giver) uu-# join dyn_field_content_person dfcp on (dfp.id = dfcp.dyn_field_id and dfcp.userid=p.userid) uu-# left outer join participant pt on (pt.userid = p.userid and pt.course_id = 707) uu-# WHERE c.id = 707 uu-# group by 1 uu-# ;

QUERY PLAN

--------------------------------------------------------------------------- --------------------------------------------------------------------------- --------------------------------------- --------- Aggregate (cost=17928.32..17930.37 rows=27 width=106) (actual time=171.37..172.58 rows=19 loops=1) -> Group (cost=17928.32..17929.68 rows=273 width=106) (actual time=171.27..172.14 rows=60 loops=1) -> Sort (cost=17928.32..17929.00 rows=273 width=106) (actual time=171.26..171.45 rows=60 loops=1) Sort Key: lower(substr(p.last_name, 1, 1)) -> Merge Join (cost=17545.53..17917.25 rows=273 width=106) (actual time=36.64..170.53 rows=60 loops=1) Merge Cond: ("outer".userid = "inner".userid) Join Filter: ("inner".course_id = 707) -> Sort (cost=17545.53..17546.22 rows=273 width=88) (actual time=28.62..28.84 rows=60 loops=1) Sort Key: dfcp.userid -> Nested Loop (cost=0.00..17534.46 rows=273 width=88) (actual time=7.99..27.49 rows=60 loops=1) Join Filter: ("outer".id = "inner".dyn_field_id) -> Nested Loop (cost=0.00..3685.31 rows=272 width=72) (actual time=7.67..8.95 rows=60 loops=1) Join Filter: ("outer".giver = "inner".giver) -> Index Scan using df_person_giver_id_idx on dyn_field_person dfp (cost=0.00..6.20 rows=1 width=16) (actual time=0.14..0.17 rows=1 loops=1) Filter: (extern_item_id = 10) -> Materialize (cost=3674.43..3674.43 rows=374 width=56) (actual time=7.49..7.69 rows=60 loops=1) -> Nested Loop (cost=0.00..3674.43 rows=374 width=56) (actual time=0.24..7.22 rows=60 loops=1) -> Nested Loop (cost=0.00..1400.89 rows=378 width=33) (actual time=0.10..1.34 rows=60 loops=1) -> Index Scan using course_pkey on course c (cost=0.00..5.08 rows=1 width=16) (actual time=0.04..0.05 rows=1 loops=1) Index Cond: (id = 707) -> Index Scan using group_data_this_idx on group_data gd (cost=0.00..1390.80 rows=402 width=17) (actual time=0.04..0.70 rows=60 lo ops=1) Index Cond: ("outer".active_group_id = gd.this_group_id) -> Index Scan using person_pkey on person p (cost=0.00..6.01 rows=1 width=23) (actual time=0.07..0.08 rows=1 loops=60) Index Cond: ("outer".item_text = p.userid) -> Index Scan using df_content_person_userid_id_idx on dyn_field_content_person dfcp (cost=0.00..50.75 rows=12 width=16) (actual time=0.08..0.23 rows=11 l oops=60) Index Cond: (dfcp.userid = "outer".item_text) -> Index Scan using participant_uid_cid_idx on participant pt (cost=0.00..349.76 rows=7808 width=18) (actual time=0.07..84.34 rows=7722 loops=1) Total runtime: 173.37 msec (28 rader)

Tid: 183,37 ms


---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
     joining column's datatypes do not match

[PERFORM] avoiding seqscan?

Reply via email to