Re: [PERFORM] Odd problem with performance in duplicate database

2003-08-14 Thread Josh Berkus
Ron, If the databases are on different machines, maybe the postgres.conf or pg_hba.conf files are different, and the buffer counts is affect- ing the optimizer? The databases are on the same machine, using the same postmaster. -- -Josh Berkus Aglio Database Solutions San Francisco

Re: [PERFORM] Odd problem with performance in duplicate database

2003-08-14 Thread Josh Berkus
Tom, Partly. The numbers suggest that in ANALYZE's default sample of 3000 rows, it's only finding about a dozen non-null tgroup_ids (yielding the 0.996 null_frac value); and that in one case all dozen are different and in the other case there are two duplicates. It would help if you

Re: [PERFORM] Odd problem with performance in duplicate database

2003-08-14 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes: Also, there doesn't seem to be any way in 7.2 for me to find out what the current statistics target for a column is. What am I missing? There still isn't a handy command for it --- you have to look at pg_attribute.attstattarget for the column.

Re: [PERFORM] Odd problem with performance in duplicate database

2003-08-14 Thread Peter Darley
; PgSQL Performance ML Subject: Re: [PERFORM] Odd problem with performance in duplicate database Ron, If the databases are on different machines, maybe the postgres.conf or pg_hba.conf files are different, and the buffer counts is affect- ing the optimizer? The databases are on the same machine

Re: [PERFORM] Odd problem with performance in duplicate database

2003-08-14 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes: Tom, Okay, here's our problem: live DB: tgroup_id n_distinct = -1 test DN: tgroup_id n_distinct = 11 The former estimate actually means that it thinks tgroup_id is a unique column, whereas the latter says there are only 11 distinct values in the

Re: [PERFORM] Odd problem with performance in duplicate database

2003-08-14 Thread Josh Berkus
Tom, Partly. The numbers suggest that in ANALYZE's default sample of 3000 rows, it's only finding about a dozen non-null tgroup_ids (yielding the 0.996 null_frac value); and that in one case all dozen are different and in the other case there are two duplicates. It would help if you

Re: [PERFORM] Odd problem with performance in duplicate database

2003-08-14 Thread Josh Berkus
Folks, More followup on this: The crucial difference between the two execution plans is this clause: test db has: - Seq Scan on case_clients (cost=0.00..3673.48 rows=11274 width=11) (actual time=0.02..302.20 rows=8822 loops=855) whereas live db has: - Index Scan using idx_caseclients_case

Re: [PERFORM] Odd problem with performance in duplicate database

2003-08-14 Thread Josh Berkus
Tom, Let's see the pg_stats rows for case_clients in both databases. The entries for trial_groups might be relevant too. My reading is that the case is borderline; that is, becuase the correlation is about 10-20% higher on the test database (since it was restored clean from backup) the