Auto-vacuum has made Postgres a much more "friendly" system.  Is there some 
reason the planner can't also auto-ANALYZE in some situations?

Here's an example I ran into:

  create table my_tmp_table (...);
  insert into my_tmp_table (select some stuff from here and there);
  select ... from my_tmp_table join another_table on (...);

The last statement generated a horrible plan, because the planner had no idea 
what was in the temporary table (which only had about 100 rows in it).  Simply 
inserting an ANALYZE before the SELECT improved performance by a factor of 100 
or so.

There are several situations where you could automatically analyze the data.

1. Any time you have to do a full table scan, you might as well throw in an 
ANALYZE of the data you're scanning.  If I understand things, ANALYZE takes a 
random sample anyway, so a full table scan should be able to produce even 
better statistics than a normal ANALYZE.

2. If you have a table with NO statistics, the chances of generating a sensible 
plan are pretty random.  Since ANALYZE is quite fast, if the planner encounters 
no statistics, why not ANALYZE it on the spot?  (This might need to be a 
configurable feature, though.)

3. A user-configurable update threshold, such as, "When 75% of the rows have changed 
since the last ANALYZE, trigger an auto-analyze."  The user-configurable part would 
account for the fact that some tables stats don't change much even after many updates, 
but others may need to be reanalyzed after a modest number of updates.

Auto-vacuum, combined with auto-analyze, would eliminate many of the problems that plague 
neophyte (and sometimes experienced) users of Postgres.  A substantial percentage of the 
questions to this list are answered with, "Have you ANALYZED?"

Craig

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Reply via email to