I'm trying to port our application from MS-SQL to Postgres. We have implemented all of our rather complicated application security in the database. The query that follows takes a half of a second or less on MS-SQL server and around 5 seconds on Postgres. My concern is that this data set is rather "small" by our applications standards. It is not unusual for the da_answer table to have 2-4 million records. I'm worried that if this very small data set is taking 5 seconds, then a "regular sized" data set will take far too long.
I originally thought the NOT EXISTS on the "da_data_restrict_except_open" table was killing performance, but the query took the exact same amount of time after I deleted all rows from this table. Note that the hard-coded 999999999.0, and 4000 parameters, as well as the parameter to svp_getparentproviders are the three variables that change from one run of this query to the next. I'm using Postgres 7.4.5 as packaged in Debian. shared_buffers is set to 57344 and sort_mem=4096. The machine has an AMD 1.8+ and ` gig of RAM. Here are some relevant performance statistics: richard:/usr/share/cups/model# cat /proc/sys/kernel/shmmax 536870912 richard:/usr/share/cups/model# cat /proc/sys/kernel/shmall 536870912 richard:/home/richard# hdparm -tT /dev/hda Timing cached reads: 1112 MB in 2.00 seconds = 556.00 MB/sec Timing buffered disk reads: 176 MB in 3.02 seconds = 58.28 MB/sec I have included an EXPLAIN ANALYZE, relevant table counts, and relevant indexing information. If anyone has any suggestions on how to improve performance.... TIA! SELECT tab.answer_id, client_id, question_id, recordset_id, date_effective, virt_field_name FROM ( SELECT a.uid AS answer_id, a.client_id, a.question_id, recordset_id, date_effective FROM da_answer a WHERE a.date_effective <= 9999999999.0 AND a.inactive != 1 AND ( 5000 = 4000 OR (EXISTS (SELECT * FROM svp_getparentproviderids(1) WHERE svp_getparentproviderids = a.provider_id)) ) UNION SELECT a.uid AS answer_id, a.client_id, a.question_id, recordset_id, date_effective FROM da_answer a, ( SELECT main_id FROM da_data_restrict WHERE type_id = 2 AND (provider_id IN (SELECT * FROM svp_getparentproviderids(1))) UNION SELECT sa.uid AS main_id FROM da_answer sa JOIN da_data_restrict_except_closed dr ON dr.main_id = sa.uid AND dr.type_id = 2 AND dr.except_provider_id = 1 WHERE (restricted = 1) AND (restricted_closed_except = 1) AND sa.covered_by_roi = 1 UNION SELECT sa.uid AS main_id FROM da_answer sa WHERE (restricted = 0) AND (restricted_open_except = 1) AND (NOT EXISTS (SELECT dr.main_id FROM da_data_restrict_except_open dr WHERE (dr.main_id = sa.uid) AND (dr.type_id = 2) AND (dr.except_provider_id in (select * from svp_getparentproviderids(1))))) AND sa.covered_by_roi = 1 UNION SELECT sa.uid AS main_id FROM da_answer sa WHERE (restricted = 0) AND (restricted_open_except = 0) AND sa.covered_by_roi = 1 ) sec WHERE a.covered_by_roi = 1 AND a.date_effective <= 9999999999.0 AND a.inactive != 1 AND a.uid = sec.main_id AND 5000 > 4000 ) tab, da_question q WHERE tab.question_id = q.uid AND (min_access_level <= 4000 OR min_access_level IS NULL) Table counts from relevant tables da_question 1095 da_answer 21117 da_question 1095 da_data_restrict_except_closed 3087 da_data_restrict_except_open 13391 svp_getparentproviderids(1) 1 Relevant Index create index in_da_data_restrict_provider_id on da_data_restrict(provider_id); create index in_da_data_restrict_main_id on da_data_restrict(main_id); create index in_da_data_restrict_type_id on da_data_restrict(type_id); create index in_da_data_restrict_client_id on da_data_restrict(client_id); create index in_da_dr_type_provider on da_data_restrict(type_id,provider_id); create index in_da_data_rec_provider_id ON da_data_restrict_except_closed(provider_id); create index in_da_data_rec_type_id ON da_data_restrict_except_closed(type_id); create index in_da_data_rec_main_id ON da_data_restrict_except_closed(main_id); create index in_da_data_rec_except_provider_id ON da_data_restrict_except_closed(except_provider_id); create index in_da_data_reo_provider_id ON da_data_restrict_except_open(provider_id); create index in_da_data_reo_type_id ON da_data_restrict_except_open(type_id); create index in_da_data_reo_main_id ON da_data_restrict_except_open(main_id); create index in_da_data_reo_except_provider_id ON da_data_restrict_except_open(except_provider_id); create index in_da_answer_client_id ON da_answer(client_id); create index in_da_answer_provider_id ON da_answer(provider_id); create index in_da_answer_question_id ON da_answer(question_id); create index in_da_answer_recordset_id ON da_answer(recordset_id); create index in_da_answer_restricted ON da_answer(restricted); create index in_da_answer_restricted_open_except ON da_answer(restricted_open_except); create index in_da_answer_restricted_closed_except ON da_answer(restricted_closed_except); create index in_da_answer_date_effective ON da_answer(date_effective); create index in_da_answer_inactive ON da_answer(inactive); create index in_da_answer_covered_by_roi ON da_answer(covered_by_roi); create index in_da_ed_inactive_roi ON da_answer(date_effective,inactive, covered_by_roi); create index in_da_question_mal ON da_question(min_access_level);
QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Hash Join (cost=1054186.23..1054631.36 rows=4496 width=67) (actual time=4902.250..4979.060 rows=7653 loops=1) Hash Cond: ("outer".question_id = "inner".uid) -> Subquery Scan tab (cost=1054123.62..1054457.09 rows=13339 width=24) (actual time=4896.963..4933.896 rows=7657 loops=1) -> Unique (cost=1054123.62..1054323.70 rows=13339 width=24) (actual time=4896.948..4915.498 rows=7657 loops=1) -> Sort (cost=1054123.62..1054156.96 rows=13339 width=24) (actual time=4896.944..4903.402 rows=7717 loops=1) Sort Key: answer_id, client_id, question_id, recordset_id, date_effective -> Append (cost=0.00..1053209.67 rows=13339 width=24) (actual time=279.091..4841.605 rows=7717 loops=1) -> Subquery Scan "*SELECT* 1" (cost=0.00..64034.15 rows=10540 width=24) (actual time=279.089..4419.371 rows=161 loops=1) -> Seq Scan on da_answer a (cost=0.00..63928.75 rows=10540 width=24) (actual time=279.080..4418.808 rows=161 loops=1) Filter: ((date_effective <= 9999999999::double precision) AND (inactive <> 1) AND (subplan)) SubPlan -> Function Scan on svp_getparentproviderids (cost=0.00..15.00 rows=5 width=4) (actual time=0.203..0.203 rows=0 loops=21089) Filter: (svp_getparentproviderids = $1) -> Subquery Scan "*SELECT* 2" (cost=988627.58..989175.52 rows=2799 width=24) (actual time=290.730..417.720 rows=7556 loops=1) -> Hash Join (cost=988627.58..989147.53 rows=2799 width=24) (actual time=290.722..395.739 rows=7556 loops=1) Hash Cond: ("outer".main_id = "inner".uid) -> Subquery Scan sec (cost=987913.23..988002.59 rows=5957 width=4) (actual time=203.862..225.462 rows=7567 loops=1) -> Unique (cost=987913.23..987943.02 rows=5957 width=4) (actual time=203.851..215.834 rows=7567 loops=1) -> Sort (cost=987913.23..987928.12 rows=5957 width=4) (actual time=203.843..207.273 rows=7567 loops=1) Sort Key: main_id -> Append (cost=160.61..987539.72 rows=5957 width=4) (actual time=35.798..187.293 rows=7567 loops=1) -> Subquery Scan "*SELECT* 1" (cost=160.61..164.53 rows=46 width=4) (actual time=35.796..35.923 rows=35 loops=1) -> Hash Join (cost=160.61..164.07 rows=46 width=4) (actual time=35.791..35.868 rows=35 loops=1) Hash Cond: ("outer".svp_getparentproviderids = "inner".provider_id) -> HashAggregate (cost=15.00..15.00 rows=200 width=4) (actual time=0.491..0.492 rows=1 loops=1) -> Function Scan on svp_getparentproviderids (cost=0.00..12.50 rows=1000 width=4) (actual time=0.473..0.474 rows=1 loops=1) -> Hash (cost=145.50..145.50 rows=46 width=8) (actual time=35.256..35.256 rows=0 loops=1) -> Index Scan using in_da_dr_type_provider on da_data_restrict (cost=0.00..145.50 rows=46 width=8) (actual time=0.041..26.627 rows=7280 loops=1) Index Cond: (type_id = 2) -> Subquery Scan "*SELECT* 2" (cost=0.00..53.69 rows=1 width=4) (actual time=0.030..0.030 rows=0 loops=1) -> Nested Loop (cost=0.00..53.68 rows=1 width=4) (actual time=0.028..0.028 rows=0 loops=1) -> Index Scan using in_da_data_rec_except_provider_id on da_data_restrict_except_closed dr (cost=0.00..50.65 rows=1 width=4) (actual time=0.026..0.026 rows=0 loops=1) Index Cond: (except_provider_id = 1) Filter: (type_id = 2) -> Index Scan using da_answer_pkey on da_answer sa (cost=0.00..3.02 rows=1 width=4) (never executed) Index Cond: ("outer".main_id = sa.uid) Filter: ((restricted = 1) AND (restricted_closed_except = 1) AND (covered_by_roi = 1)) -> Subquery Scan "*SELECT* 3" (cost=0.00..986638.62 rows=678 width=4) (actual time=0.346..77.393 rows=1841 loops=1) -> Index Scan using in_da_answer_restricted_open_except on da_answer sa (cost=0.00..986631.84 rows=678 width=4) (actual time=0.342..74.614 rows=1841 loops=1) Index Cond: (restricted_open_except = 1) Filter: ((restricted = 0) AND (covered_by_roi = 1) AND (NOT (subplan))) SubPlan -> Nested Loop IN Join (cost=0.00..227.09 rows=1 width=4) (actual time=0.026..0.026 rows=0 loops=1841) Join Filter: ("outer".except_provider_id = "inner".svp_getparentproviderids) -> Index Scan using in_da_data_reo_main_id on da_data_restrict_except_open dr (cost=0.00..212.09 rows=1 width=8) (actual time=0.009..0.014 rows=2 loops=1841) Index Cond: (main_id = $0) Filter: (type_id = 2) -> Function Scan on svp_getparentproviderids (cost=0.00..12.50 rows=1000 width=4) (actual time=0.002..0.003 rows=1 loops=3793) -> Subquery Scan "*SELECT* 4" (cost=0.00..682.87 rows=5232 width=4) (actual time=0.064..69.107 rows=5691 loops=1) -> Seq Scan on da_answer sa (cost=0.00..630.55 rows=5232 width=4) (actual time=0.059..61.052 rows=5691 loops=1) Filter: ((restricted = 0) AND (restricted_open_except = 0) AND (covered_by_roi = 1)) -> Hash (cost=630.55..630.55 rows=9922 width=24) (actual time=86.699..86.699 rows=0 loops=1) -> Seq Scan on da_answer a (cost=0.00..630.55 rows=9922 width=24) (actual time=0.043..73.232 rows=10062 loops=1) Filter: ((covered_by_roi = 1) AND (date_effective <= 9999999999::double precision) AND (inactive <> 1)) -> Hash (cost=61.69..61.69 rows=369 width=47) (actual time=5.241..5.241 rows=0 loops=1) -> Seq Scan on da_question q (cost=0.00..61.69 rows=369 width=47) (actual time=0.026..4.071 rows=1087 loops=1) Filter: ((min_access_level <= 4000) OR (min_access_level IS NULL)) Total runtime: 4986.508 ms (58 rows) count ------- 21117 (1 row)
---------------------------(end of broadcast)--------------------------- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match