Hi all;

we've been fighting this query for a few days now. we bumped up the statistict 
target for the a.id , c.url_hits_id and the b.id columns below to 250 and ran 
an analyze on the relevant tables.  we killed it after 8hrs.  

Note the url_hits table has > 1.4billion rows

Any suggestions?



$ psql -ef expl.sql pwreport                                                
explain                                                                         
                    
select                                                                          
                    
a.id,                                                                           
                    
ident_id,                                                                       
                    
time,                                                                           
                    
customer_name,                                                                  
                    
extract('day' from timezone(e.name, to_timestamp(a.time))) as day,              
                    
category_id                                                                     
                    
from                                                                            
                    
pwreport.url_hits a left outer join                                             
                    
pwreport.url_hits_category_jt c on (a.id = c.url_hits_id),                      
                    
pwreport.ident b,                                                               
                    
pwreport.timezone e                                                             
                    
where                                                                           
                    
a.ident_id = b.id                                                               
                    
and b.timezone_id = e.id                                                        
                    
and time >= extract ('epoch' from timestamp '2009-08-12')                       
                    
and time < extract ('epoch' from timestamp '2009-08-13' )                       
                    
and direction = 'REQUEST'                                                       
                    
;
                                                                                
                     QUERY 
PLAN                                                                            
             
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Right Join  (cost=47528508.61..180424544.59 rows=10409251 width=53)
   Merge Cond: (c.url_hits_id = a.id)
   ->  Index Scan using mt_url_hits_category_jt_url_hits_id_index on 
url_hits_category_jt c  (cost=0.00..122162596.63 rows=4189283233 width=8)
   ->  Sort  (cost=47528508.61..47536931.63 rows=3369210 width=49)
         Sort Key: a.id
         ->  Hash Join  (cost=2565.00..47163219.21 rows=3369210 width=49)
               Hash Cond: (b.timezone_id = e.id)
               ->  Hash Join  (cost=2553.49..47116881.07 rows=3369210 
width=37)
                     Hash Cond: (a.ident_id = b.id)
                     ->  Seq Scan on url_hits a  (cost=0.00..47051154.89 
rows=3369210 width=12)
                           Filter: ((direction = 
'REQUEST'::proxy_direction_enum) AND (("time")::double precision >= 
1250035200::double precision) AND (("time")::double precision < 
1250121600::double precision))
                     ->  Hash  (cost=2020.44..2020.44 rows=42644 width=29)
                           ->  Seq Scan on ident b  (cost=0.00..2020.44 
rows=42644 width=29)
               ->  Hash  (cost=6.78..6.78 rows=378 width=20)
                     ->  Seq Scan on timezone e  (cost=0.00..6.78 rows=378 
width=20)
(15 rows)


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Reply via email to