"Tom Lane" <[EMAIL PROTECTED]> writes:

> Stefan Kaltenbrunner <[EMAIL PROTECTED]> writes:
>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lionfish&dt=2007-07-24%2005:30:13
>> any ideas ?
>
> I saw what I think was the identical failure last night on my own
> machine, but it wasn't repeatable.  Evidently the planner is changing to
> a different plan for those queries, but why has this only started
> recently?  Maybe the recent changes to autovacuum defaults are causing
> autovac to hit these tables when it never did before?

Indeed the only alternate plan I can imagine for this is to do the join the
other way around. And given the large difference in sizes between the two
tables the only way I could get that to happen was by obliterating the
statistics entirely for one table but having stats for the other.

This does raise a possible issue with autovacuum. Treating ANALYZE like VACUUM
and running it on individual tables one at a time is probably the wrong thing
to be doing. What really has to happen is it should run analyze on all tables
together in a single transaction and commit all the new stats together.
Out-of-sync stats can be worse than out-of-date stats.


With stats on timestamp_tbl but not on interval_tbl:

postgres-#                                                                     
QUERY PLAN                                                                     
---------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=2.02..48.29 rows=432 width=24) (actual time=0.112..1.515 
rows=104 loops=1)
   ->  Seq Scan on interval_tbl i  (cost=0.00..36.55 rows=9 width=16) (actual 
time=0.036..0.070 rows=2 loops=1)
         Filter: ((f1 >= '00:00:00'::interval) AND (f1 <= '23:00:00'::interval))
   ->  Materialize  (cost=2.02..2.50 rows=48 width=8) (actual time=0.030..0.377 
rows=52 loops=2)
         ->  Seq Scan on timestamp_tbl t  (cost=0.00..1.97 rows=48 width=8) 
(actual time=0.048..0.333 rows=52 loops=1)
               Filter: ((d1 >= '1990-01-01 00:00:00'::timestamp without time 
zone) AND (d1 <= '2001-01-01 00:00:00'::timestamp without time zone))
 Total runtime: 1.904 ms
(7 rows)



All other combinations perform the join the other way around:

With both analyzed:

postgres-#                                                                  
QUERY PLAN                                                                  
---------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1.15..6.37 rows=144 width=24) (actual time=0.109..1.653 
rows=104 loops=1)
   ->  Seq Scan on timestamp_tbl t  (cost=0.00..1.97 rows=48 width=8) (actual 
time=0.063..0.356 rows=52 loops=1)
         Filter: ((d1 >= '1990-01-01 00:00:00'::timestamp without time zone) 
AND (d1 <= '2001-01-01 00:00:00'::timestamp without time zone))
   ->  Materialize  (cost=1.15..1.18 rows=3 width=16) (actual time=0.003..0.008 
rows=2 loops=52)
         ->  Seq Scan on interval_tbl i  (cost=0.00..1.15 rows=3 width=16) 
(actual time=0.017..0.052 rows=2 loops=1)
               Filter: ((f1 >= '00:00:00'::interval) AND (f1 <= 
'23:00:00'::interval))
 Total runtime: 2.025 ms
(7 rows)


With no stats at all:

postgres-# postgres-#                                                           
       QUERY PLAN                                                               
   
---------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=36.56..80.89 rows=99 width=24) (actual time=0.147..1.698 
rows=104 loops=1)
   ->  Seq Scan on timestamp_tbl t  (cost=0.00..42.10 rows=11 width=8) (actual 
time=0.101..0.388 rows=52 loops=1)
         Filter: ((d1 >= '1990-01-01 00:00:00'::timestamp without time zone) 
AND (d1 <= '2001-01-01 00:00:00'::timestamp without time zone))
   ->  Materialize  (cost=36.56..36.65 rows=9 width=16) (actual 
time=0.003..0.008 rows=2 loops=52)
         ->  Seq Scan on interval_tbl i  (cost=0.00..36.55 rows=9 width=16) 
(actual time=0.017..0.053 rows=2 loops=1)
               Filter: ((f1 >= '00:00:00'::interval) AND (f1 <= 
'23:00:00'::interval))
 Total runtime: 2.063 ms
(7 rows)

With stats on interval_tbl but not timestamp_tbl:

postgres-# postgres-#                                                           
       QUERY PLAN                                                               
   
---------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1.15..44.00 rows=33 width=24) (actual time=0.100..1.725 
rows=104 loops=1)
   ->  Seq Scan on timestamp_tbl t  (cost=0.00..42.10 rows=11 width=8) (actual 
time=0.055..0.351 rows=52 loops=1)
         Filter: ((d1 >= '1990-01-01 00:00:00'::timestamp without time zone) 
AND (d1 <= '2001-01-01 00:00:00'::timestamp without time zone))
   ->  Materialize  (cost=1.15..1.18 rows=3 width=16) (actual time=0.003..0.008 
rows=2 loops=52)
         ->  Seq Scan on interval_tbl i  (cost=0.00..1.15 rows=3 width=16) 
(actual time=0.016..0.051 rows=2 loops=1)
               Filter: ((f1 >= '00:00:00'::interval) AND (f1 <= 
'23:00:00'::interval))
 Total runtime: 2.104 ms
(7 rows)


-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com


---------------------------(end of broadcast)---------------------------
TIP 7: You can help support the PostgreSQL project by donating at

                http://www.postgresql.org/about/donate

Reply via email to