Hello All,

We are struggling with a specific query that is killing us.  When doing 
explain analyze on the entire query, we *seem* to be getting killed by the 
estimated number of rows on a case statement calculation.

I've included a snippet from the explain analyze of the much larger query. The 
line in question, (cost=0.00..106.52 rows=1 width=16) (actual 
time=0.048..67.728 rows=4725 loops=1) shows that it returned 4700 rows 
instead of 1 which when coupled with a later join causes the statement to run 
over 3 minutes.[1] 

It seems that it thinks that the scan on role_id is going to return 1 row, but 
in reality returns 4725 rows.  The case statement causing the problem uses 
todays date to see if a particular row is still active.  Here is a test case 
showing how far off the estimate is from the reality. [2]

I'm not too surprised to see that the estimate is off because it is 
calculated, but does anyone know either how to make the estimate more 
accurate so it picks a better plan, or is there a better way to do a "status" 
function based off of the current date so that it is more efficient?  I've 
played with statistics on this table (racheting them up to 1000) with no 
change in the plan.

Any thoughts?

-Chris

[1] explain analyze snippet from larger query
->  Nested Loop  (cost=0.00..955.70 rows=1 width=204) (actual 
time=3096.689..202704.649 rows=17 loops=1)
      Join Filter: ("inner".nameid = "outer".name_id)
      ->  Nested Loop  (cost=0.00..112.25 rows=1 width=33) (actual 
time=0.271..90.760 rows=4725 loops=1)
            ->  Index Scan using role_definition_description_idx on 
role_definition rdf  (cost=0.00..5.72 rows=1 width=21) (actual 
time=0.215..0.218 rows=1 loops=1)
                  Index Cond: (description = 'Participant'::text)
                  Filter: (program_id = 120)
            ->  Index Scan using roles_role_id_idx on roles rol  
(cost=0.00..106.52 rows=1 width=16) (actual time=0.048..67.728 rows=4725 
loops=1)
                  Index Cond: (rol.role_id = "outer".role_id)
                  Filter: (CASE WHEN (role_id IS NULL) THEN NULL::text WHEN 
(("begin" IS NOT NULL) AND ("end" IS NOT NULL)) THEN CASE WHEN 
((('now'::text)::date >= "begin") AND (('now'::text)::date <= "end")) THEN 
'Active'::text ELSE 'Inactive'::text END WHEN ("begin" IS NOT NULL) THEN CASE 
WHEN (('now'::text)::date >= "begin") THEN 'Active'::text ELSE 
'Inactive'::text END WHEN ("end" IS NOT NULL) THEN CASE WHEN 
(('now'::text)::date <= "end") THEN 'Active'::text ELSE 'Inactive'::text END 
ELSE 'Active'::text END = 'Active'::text)
      ->  Nested Loop Left Join  (cost=0.00..842.19 rows=97 width=175) (actual 
time=6.820..42.863 rows=21 loops=4725)
            ->  Index Scan using namemaster_programid_idx on namemaster dem  
(cost=0.00..470.12 rows=97 width=164) (actual time=6.811..42.654 rows=21 
loops=4725)
                  Index Cond: (programid = 120)
                  Filter: ((name_float_lfm ~~* '%clark%'::text) OR 
(metaphone(name_float_lfm, 4) = 'KLRK'::text) OR (soundex(name_float_lfm) = 
'C462'::text))
            ->  Index Scan using validanswerid_pk on validanswer ina  
(cost=0.00..3.82 rows=1 width=19) (actual time=0.003..0.004 rows=1 
loops=99225)
                  Index Cond: (ina.validanswerid = "outer".inactive)

---------------------
[2] A much simpler statement triggers the incorrect row counts here.

explain analyze
select * 
from roles rol
where  

            CASE
           WHEN rol.role_id IS NULL
           THEN NULL
           WHEN rol."begin" IS NOT NULL and rol."end" IS NOT NULL
           THEN
             CASE WHEN TIMESTAMP 'now'>=rol."begin" and TIMESTAMP 
'now'<=rol."end"
             THEN 'Active'
             ELSE 'Inactive' END
           WHEN rol."begin" IS NOT NULL
           THEN
             CASE WHEN TIMESTAMP 'now'>=rol."begin"
             THEN 'Active'
             ELSE 'Inactive' END
           WHEN rol."end" IS NOT NULL
           THEN
             CASE WHEN TIMESTAMP 'now'<=rol."end"
             THEN 'Active'
             ELSE 'Inactive' END
           ELSE 'Active'
           END = 'Active'

Seq Scan on roles rol  (cost=0.00..2368.54 rows=413 width=20) (actual 
time=0.046..562.635 rows=79424 loops=1)
   Filter: (CASE WHEN (role_id IS NULL) THEN NULL::text WHEN (("begin" IS NOT 
NULL) AND ("end" IS NOT NULL)) THEN CASE WHEN (('2005-09-13 
16:43:18.721214'::timestamp without time zone >= "begin") AND ('2005-09-13 
16:43:18.721214'::timestamp without time zone <= "end")) THEN 'Active'::text 
ELSE 'Inactive'::text END WHEN ("begin" IS NOT NULL) THEN CASE WHEN 
('2005-09-13 16:43:18.721214'::timestamp without time zone >= "begin") THEN 
'Active'::text ELSE 'Inactive'::text END WHEN ("end" IS NOT NULL) THEN CASE 
WHEN ('2005-09-13 16:43:18.721214'::timestamp without time zone <= "end") 
THEN 'Active'::text ELSE 'Inactive'::text END ELSE 'Active'::text END = 
'Active'::text)
 Total runtime: 884.456 ms
(3 rows)
-- 
Chris Kratz

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly

Reply via email to