Josh Berkus <[EMAIL PROTECTED]> writes:
> My reading is that the case is "borderline";
Well, clearly the planner is flipping to a much less desirable plan, but
the core estimation error is not borderline by my standards. In the
live DB we have this subplan:
-> Nested Loop (cost=0.00..7.41 rows
Tom,
> Let's see the pg_stats rows for case_clients in both databases. The
> entries for trial_groups might be relevant too.
My reading is that the case is "borderline"; that is, becuase the correlation
is about 10-20% higher on the test database (since it was restored "clean"
from backup) the
Folks,
More followup on this:
The crucial difference between the two execution plans is this clause:
test db has:
-> Seq Scan on case_clients (cost=0.00..3673.48 rows=11274 width=11) (actual
time=0.02..302.20 rows=8822 loops=855)
whereas live db has:
-> Index Scan using idx_caseclients_case
Josh Berkus <[EMAIL PROTECTED]> writes:
> Also, there doesn't seem to be any way in 7.2 for me to find out what the
> current statistics target for a column is. What am I missing?
There still isn't a handy command for it --- you have to look at
pg_attribute.attstattarget for the column.
Tom,
> Partly. The numbers suggest that in ANALYZE's default sample of 3000
> rows, it's only finding about a dozen non-null tgroup_ids (yielding the
> 0.996 null_frac value); and that in one case all dozen are different and
> in the other case there are two duplicates. It would help if you
> bo
Josh Berkus <[EMAIL PROTECTED]> writes:
> Still, they are differences. Attached.
Actually, it was mainly "cases" that I wanted to know about ---
specifically, whichever columns are in "idx_cases_tgroup".
Also, which of the trial_groups columns is the pkey?
regards, tom la
Tom,
> Partly. The numbers suggest that in ANALYZE's default sample of 3000
> rows, it's only finding about a dozen non-null tgroup_ids (yielding the
> 0.996 null_frac value); and that in one case all dozen are different and
> in the other case there are two duplicates. It would help if you
> bo
on Johnson; PgSQL Performance ML
Subject: Re: [PERFORM] Odd problem with performance in duplicate
database
Ron,
> If the databases are on different machines, maybe the postgres.conf
> or pg_hba.conf files are different, and the buffer counts is affect-
> ing the optimizer?
The databa
Josh Berkus <[EMAIL PROTECTED]> writes:
> Tom,
>> Okay, here's our problem:
>>
>> live DB: tgroup_id n_distinct = -1
>>
>> test DN: tgroup_id n_distinct = 11
>>
>> The former estimate actually means that it thinks tgroup_id is a unique
>> column, whereas the latter says there are only 11 disti
Josh Berkus <[EMAIL PROTECTED]> writes:
> By everything I can measure, the live database and the test are
> identical; yet the test does not think that idx_caseclients_case is
> very accessable, and the live database knows it is.
Let's see the pg_stats rows for case_clients in both databases. The
Ron,
> If the databases are on different machines, maybe the postgres.conf
> or pg_hba.conf files are different, and the buffer counts is affect-
> ing the optimizer?
The databases are on the same machine, using the same postmaster.
--
-Josh Berkus
Aglio Database Solutions
San Francisco
---
On Mon, 2003-08-11 at 17:03, Josh Berkus wrote:
> Folks,
>
> More followup on this:
>
> The crucial difference between the two execution plans is this clause:
>
> test db has:
> -> Seq Scan on case_clients (cost=0.00..3673.48 rows=11274 width=11) (actual
> time=0.02..302.20 rows=8822 loops=85
Peter,
> I'm sure that you've thought of this, but it sounds like you may not have
> done an analyze in your new DB.
Yes. Also a VACUUM. Also forcing a REINDEX on the major involved tables.
Also running counts on the pg_* system tables to see if any objects did not
get restored from the
13 matches
Mail list logo