I'm trying to port our application from MS-SQL to Postgres.  We have
implemented all of our rather complicated application security in the
database.  The query that follows takes a half of a second or less on
MS-SQL server and around 5 seconds on Postgres.  My concern is that this
data set is rather "small" by our applications standards.  It is not
unusual for the da_answer table to have 2-4 million records.  I'm
worried that if this very small data set is taking 5 seconds, then a
"regular sized" data set will take far too long.

I originally thought the NOT EXISTS on the
"da_data_restrict_except_open" table was killing performance, but the
query took the exact same amount of time after I deleted all rows from
this table.  Note that the hard-coded 999999999.0, and 4000 parameters,
as well as the parameter to svp_getparentproviders are the three
variables that change from one run of this query to the next.

I'm using Postgres 7.4.5 as packaged in Debian.  shared_buffers is set
to 57344 and sort_mem=4096.

The machine has an AMD 1.8+ and ` gig of RAM.  Here are some relevant
performance statistics:
richard:/usr/share/cups/model# cat /proc/sys/kernel/shmmax
536870912
richard:/usr/share/cups/model# cat /proc/sys/kernel/shmall
536870912
richard:/home/richard# hdparm -tT /dev/hda
 Timing cached reads:   1112 MB in  2.00 seconds = 556.00 MB/sec
 Timing buffered disk reads:  176 MB in  3.02 seconds =  58.28 MB/sec

I have included an EXPLAIN ANALYZE, relevant table counts, and relevant
indexing information.  If anyone has any suggestions on how to improve
performance....  TIA!

SELECT tab.answer_id, client_id, question_id, recordset_id,
date_effective, virt_field_name
FROM
(
    SELECT a.uid AS answer_id, a.client_id, a.question_id, recordset_id,
date_effective
    FROM da_answer a
    WHERE a.date_effective <= 9999999999.0
    AND a.inactive != 1
    AND
    (
            5000 = 4000 
            OR
            (EXISTS (SELECT * FROM svp_getparentproviderids(1) WHERE
svp_getparentproviderids = a.provider_id))
    )
    UNION
    SELECT a.uid AS answer_id, a.client_id, a.question_id, recordset_id,
date_effective
    FROM da_answer a,
    (    
        SELECT main_id 
        FROM da_data_restrict
        WHERE type_id = 2        
        AND (provider_id IN (SELECT * FROM svp_getparentproviderids(1)))
               
        UNION
            
            SELECT sa.uid AS main_id 
            FROM da_answer sa
                JOIN da_data_restrict_except_closed dr ON dr.main_id =
sa.uid AND dr.type_id = 2 AND dr.except_provider_id = 1
            WHERE (restricted = 1) 
                AND (restricted_closed_except = 1)                
                AND sa.covered_by_roi = 1
            UNION
            SELECT sa.uid AS main_id 
            FROM da_answer sa
            WHERE (restricted = 0) 
                AND (restricted_open_except = 1)                
                AND (NOT EXISTS (SELECT dr.main_id FROM
da_data_restrict_except_open dr WHERE (dr.main_id = sa.uid) AND
(dr.type_id = 2) AND (dr.except_provider_id in (select * from
svp_getparentproviderids(1)))))
                AND sa.covered_by_roi = 1
            UNION
            SELECT sa.uid AS main_id FROM da_answer sa WHERE (restricted
= 0) AND (restricted_open_except = 0)
                AND sa.covered_by_roi = 1
    ) sec
    WHERE a.covered_by_roi = 1
        AND a.date_effective <= 9999999999.0
        AND a.inactive != 1
        AND a.uid = sec.main_id
        AND 5000 > 4000
) tab, da_question q
WHERE tab.question_id = q.uid AND (min_access_level <= 4000 OR
min_access_level IS NULL)

Table counts from relevant tables
da_question                      1095
da_answer                       21117
da_question                      1095
da_data_restrict_except_closed   3087
da_data_restrict_except_open    13391
svp_getparentproviderids(1)         1

Relevant Index
create index in_da_data_restrict_provider_id on
da_data_restrict(provider_id);
create index in_da_data_restrict_main_id on da_data_restrict(main_id);
create index in_da_data_restrict_type_id on da_data_restrict(type_id);
create index in_da_data_restrict_client_id on
da_data_restrict(client_id);
create index in_da_dr_type_provider on
da_data_restrict(type_id,provider_id);

create index in_da_data_rec_provider_id ON
da_data_restrict_except_closed(provider_id);
create index in_da_data_rec_type_id ON
da_data_restrict_except_closed(type_id);
create index in_da_data_rec_main_id ON
da_data_restrict_except_closed(main_id);
create index in_da_data_rec_except_provider_id ON
da_data_restrict_except_closed(except_provider_id);

create index in_da_data_reo_provider_id ON
da_data_restrict_except_open(provider_id);
create index in_da_data_reo_type_id ON
da_data_restrict_except_open(type_id);
create index in_da_data_reo_main_id ON
da_data_restrict_except_open(main_id);
create index in_da_data_reo_except_provider_id ON
da_data_restrict_except_open(except_provider_id);

create index in_da_answer_client_id ON da_answer(client_id);
create index in_da_answer_provider_id ON da_answer(provider_id);
create index in_da_answer_question_id ON da_answer(question_id);
create index in_da_answer_recordset_id ON da_answer(recordset_id);
create index in_da_answer_restricted ON da_answer(restricted);
create index in_da_answer_restricted_open_except ON
da_answer(restricted_open_except);
create index in_da_answer_restricted_closed_except ON
da_answer(restricted_closed_except);
create index in_da_answer_date_effective ON da_answer(date_effective);
create index in_da_answer_inactive ON da_answer(inactive);
create index in_da_answer_covered_by_roi ON da_answer(covered_by_roi);

create index in_da_ed_inactive_roi ON da_answer(date_effective,inactive,
covered_by_roi);

create index in_da_question_mal ON da_question(min_access_level);

                                                                                
                                     QUERY PLAN                                 
                                                                                
    
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=1054186.23..1054631.36 rows=4496 width=67) (actual 
time=4902.250..4979.060 rows=7653 loops=1)
   Hash Cond: ("outer".question_id = "inner".uid)
   ->  Subquery Scan tab  (cost=1054123.62..1054457.09 rows=13339 width=24) 
(actual time=4896.963..4933.896 rows=7657 loops=1)
         ->  Unique  (cost=1054123.62..1054323.70 rows=13339 width=24) (actual 
time=4896.948..4915.498 rows=7657 loops=1)
               ->  Sort  (cost=1054123.62..1054156.96 rows=13339 width=24) 
(actual time=4896.944..4903.402 rows=7717 loops=1)
                     Sort Key: answer_id, client_id, question_id, recordset_id, 
date_effective
                     ->  Append  (cost=0.00..1053209.67 rows=13339 width=24) 
(actual time=279.091..4841.605 rows=7717 loops=1)
                           ->  Subquery Scan "*SELECT* 1"  (cost=0.00..64034.15 
rows=10540 width=24) (actual time=279.089..4419.371 rows=161 loops=1)
                                 ->  Seq Scan on da_answer a  
(cost=0.00..63928.75 rows=10540 width=24) (actual time=279.080..4418.808 
rows=161 loops=1)
                                       Filter: ((date_effective <= 
9999999999::double precision) AND (inactive <> 1) AND (subplan))
                                       SubPlan
                                         ->  Function Scan on 
svp_getparentproviderids  (cost=0.00..15.00 rows=5 width=4) (actual 
time=0.203..0.203 rows=0 loops=21089)
                                               Filter: 
(svp_getparentproviderids = $1)
                           ->  Subquery Scan "*SELECT* 2"  
(cost=988627.58..989175.52 rows=2799 width=24) (actual time=290.730..417.720 
rows=7556 loops=1)
                                 ->  Hash Join  (cost=988627.58..989147.53 
rows=2799 width=24) (actual time=290.722..395.739 rows=7556 loops=1)
                                       Hash Cond: ("outer".main_id = 
"inner".uid)
                                       ->  Subquery Scan sec  
(cost=987913.23..988002.59 rows=5957 width=4) (actual time=203.862..225.462 
rows=7567 loops=1)
                                             ->  Unique  
(cost=987913.23..987943.02 rows=5957 width=4) (actual time=203.851..215.834 
rows=7567 loops=1)
                                                   ->  Sort  
(cost=987913.23..987928.12 rows=5957 width=4) (actual time=203.843..207.273 
rows=7567 loops=1)
                                                         Sort Key: main_id
                                                         ->  Append  
(cost=160.61..987539.72 rows=5957 width=4) (actual time=35.798..187.293 
rows=7567 loops=1)
                                                               ->  Subquery 
Scan "*SELECT* 1"  (cost=160.61..164.53 rows=46 width=4) (actual 
time=35.796..35.923 rows=35 loops=1)
                                                                     ->  Hash 
Join  (cost=160.61..164.07 rows=46 width=4) (actual time=35.791..35.868 rows=35 
loops=1)
                                                                           Hash 
Cond: ("outer".svp_getparentproviderids = "inner".provider_id)
                                                                           ->  
HashAggregate  (cost=15.00..15.00 rows=200 width=4) (actual time=0.491..0.492 
rows=1 loops=1)
                                                                                
 ->  Function Scan on svp_getparentproviderids  (cost=0.00..12.50 rows=1000 
width=4) (actual time=0.473..0.474 rows=1 loops=1)
                                                                           ->  
Hash  (cost=145.50..145.50 rows=46 width=8) (actual time=35.256..35.256 rows=0 
loops=1)
                                                                                
 ->  Index Scan using in_da_dr_type_provider on da_data_restrict  
(cost=0.00..145.50 rows=46 width=8) (actual time=0.041..26.627 rows=7280 
loops=1)
                                                                                
       Index Cond: (type_id = 2)
                                                               ->  Subquery 
Scan "*SELECT* 2"  (cost=0.00..53.69 rows=1 width=4) (actual time=0.030..0.030 
rows=0 loops=1)
                                                                     ->  Nested 
Loop  (cost=0.00..53.68 rows=1 width=4) (actual time=0.028..0.028 rows=0 
loops=1)
                                                                           ->  
Index Scan using in_da_data_rec_except_provider_id on 
da_data_restrict_except_closed dr  (cost=0.00..50.65 rows=1 width=4) (actual 
time=0.026..0.026 rows=0 loops=1)
                                                                                
 Index Cond: (except_provider_id = 1)
                                                                                
 Filter: (type_id = 2)
                                                                           ->  
Index Scan using da_answer_pkey on da_answer sa  (cost=0.00..3.02 rows=1 
width=4) (never executed)
                                                                                
 Index Cond: ("outer".main_id = sa.uid)
                                                                                
 Filter: ((restricted = 1) AND (restricted_closed_except = 1) AND 
(covered_by_roi = 1))
                                                               ->  Subquery 
Scan "*SELECT* 3"  (cost=0.00..986638.62 rows=678 width=4) (actual 
time=0.346..77.393 rows=1841 loops=1)
                                                                     ->  Index 
Scan using in_da_answer_restricted_open_except on da_answer sa  
(cost=0.00..986631.84 rows=678 width=4) (actual time=0.342..74.614 rows=1841 
loops=1)
                                                                           
Index Cond: (restricted_open_except = 1)
                                                                           
Filter: ((restricted = 0) AND (covered_by_roi = 1) AND (NOT (subplan)))
                                                                           
SubPlan
                                                                             -> 
 Nested Loop IN Join  (cost=0.00..227.09 rows=1 width=4) (actual 
time=0.026..0.026 rows=0 loops=1841)
                                                                                
   Join Filter: ("outer".except_provider_id = "inner".svp_getparentproviderids)
                                                                                
   ->  Index Scan using in_da_data_reo_main_id on da_data_restrict_except_open 
dr  (cost=0.00..212.09 rows=1 width=8) (actual time=0.009..0.014 rows=2 
loops=1841)
                                                                                
         Index Cond: (main_id = $0)
                                                                                
         Filter: (type_id = 2)
                                                                                
   ->  Function Scan on svp_getparentproviderids  (cost=0.00..12.50 rows=1000 
width=4) (actual time=0.002..0.003 rows=1 loops=3793)
                                                               ->  Subquery 
Scan "*SELECT* 4"  (cost=0.00..682.87 rows=5232 width=4) (actual 
time=0.064..69.107 rows=5691 loops=1)
                                                                     ->  Seq 
Scan on da_answer sa  (cost=0.00..630.55 rows=5232 width=4) (actual 
time=0.059..61.052 rows=5691 loops=1)
                                                                           
Filter: ((restricted = 0) AND (restricted_open_except = 0) AND (covered_by_roi 
= 1))
                                       ->  Hash  (cost=630.55..630.55 rows=9922 
width=24) (actual time=86.699..86.699 rows=0 loops=1)
                                             ->  Seq Scan on da_answer a  
(cost=0.00..630.55 rows=9922 width=24) (actual time=0.043..73.232 rows=10062 
loops=1)
                                                   Filter: ((covered_by_roi = 
1) AND (date_effective <= 9999999999::double precision) AND (inactive <> 1))
   ->  Hash  (cost=61.69..61.69 rows=369 width=47) (actual time=5.241..5.241 
rows=0 loops=1)
         ->  Seq Scan on da_question q  (cost=0.00..61.69 rows=369 width=47) 
(actual time=0.026..4.071 rows=1087 loops=1)
               Filter: ((min_access_level <= 4000) OR (min_access_level IS 
NULL))
 Total runtime: 4986.508 ms
(58 rows)

 count 
-------
 21117
(1 row)

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
      joining column's datatypes do not match

Reply via email to