Re: [PERFORM] Odd problem with performance in duplicate database
Ron,

> If the databases are on different machines, maybe the postgres.conf or
> pg_hba.conf files are different, and the buffer counts are affecting
> the optimizer?

The databases are on the same machine, using the same postmaster.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])
Re: [PERFORM] Odd problem with performance in duplicate database
Tom,

> Partly.  The numbers suggest that in ANALYZE's default sample of 3000
> rows, it's only finding about a dozen non-null tgroup_ids (yielding the
> 0.996 null_frac value); and that in one case all dozen are different and
> in the other case there are two duplicates.  It would help if you
> boosted the stats target for this column by a factor of 10.  (You can
> do that in 7.2, btw --- IIRC the only problem is that a pg_dump won't
> show that you did so.)

Hmmm.  No dice.  I raised the statistics target to 1000, which increased
n_distinct to 108, which is pretty close to accurate.  However, the
planner still insists on using a seq scan on case_clients unless I drop
random_page_cost to 1.5 (which is up from 1.2, but still somewhat
unreasonable).

> But the other part of the problem is that in 7.2, the join selectivity
> estimator is way off when you are joining a unique column (like the
> pkey on the other side) to a column with a very large fraction of
> nulls.  We only discovered this recently; it's fixed as of 7.3.3:

OK, I'll talk to the client about upgrading.

--
__AGLIO DATABASE SOLUTIONS___
Josh Berkus
Complete information technology        [EMAIL PROTECTED]
and data management solutions          (415) 565-7293
for law firms, small businesses        fax 621-2533
and non-profit organizations.          San Francisco
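For reference, raising a per-column statistics target and re-sampling looks like this. This is a sketch based on the thread's discussion: the assumption that tgroup_id lives on the cases table is inferred from context, and 1000 is the value Josh reports trying.

```sql
-- Boost the sample size used by ANALYZE for this one column (7.2+).
-- Caveat from the thread: in 7.2, pg_dump will not preserve this setting.
ALTER TABLE cases ALTER COLUMN tgroup_id SET STATISTICS 1000;

-- Re-gather statistics so the planner sees the larger sample.
ANALYZE cases;
```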
Re: [PERFORM] Odd problem with performance in duplicate database
Josh Berkus <[EMAIL PROTECTED]> writes:
> Also, there doesn't seem to be any way in 7.2 for me to find out what
> the current statistics target for a column is.  What am I missing?

There still isn't a handy command for it --- you have to look at
pg_attribute.attstattarget for the column.

			regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if
    your joining column's datatypes do not match
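A minimal sketch of the pg_attribute lookup Tom describes (table and column names are the ones from the thread; -1 means the column is still on the default target):

```sql
-- attstattarget holds the per-column statistics target;
-- -1 indicates the default_statistics_target is in effect.
SELECT a.attname, a.attstattarget
FROM pg_attribute a
JOIN pg_class c ON a.attrelid = c.oid
WHERE c.relname = 'cases'
  AND a.attname = 'tgroup_id';
```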
Re: [PERFORM] Odd problem with performance in duplicate database
Josh,
	I'm sure that you've thought of this, but it sounds like you may
not have done an ANALYZE in your new DB.
Thanks,
Peter Darley

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Josh Berkus
Sent: Monday, August 11, 2003 3:48 PM
To: Ron Johnson; PgSQL Performance ML
Subject: Re: [PERFORM] Odd problem with performance in duplicate database

Ron,

> If the databases are on different machines, maybe the postgres.conf or
> pg_hba.conf files are different, and the buffer counts are affecting
> the optimizer?

The databases are on the same machine, using the same postmaster.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco
Re: [PERFORM] Odd problem with performance in duplicate database
Josh Berkus <[EMAIL PROTECTED]> writes:
> Tom,
>>> Okay, here's our problem:
>>> live DB: tgroup_id n_distinct = -1
>>> test DB: tgroup_id n_distinct = 11
>> The former estimate actually means that it thinks tgroup_id is a
>> unique column, whereas the latter says there are only 11 distinct
>> values in the column.  I assume the former is much nearer to the truth
>> (how many rows in cases, and how many distinct tgroup_id values)?
> The real case is that there are 113 distinct tgroup_ids, which cover
> about 10% of the population of cases.  The other 90% is NULL.  The
> average tgroup_id is shared between 4.7 cases.  So this seems like
> sampling error.

Partly.  The numbers suggest that in ANALYZE's default sample of 3000
rows, it's only finding about a dozen non-null tgroup_ids (yielding the
0.996 null_frac value); and that in one case all dozen are different and
in the other case there are two duplicates.  It would help if you boosted
the stats target for this column by a factor of 10.  (You can do that in
7.2, btw --- IIRC the only problem is that a pg_dump won't show that you
did so.)

But the other part of the problem is that in 7.2, the join selectivity
estimator is way off when you are joining a unique column (like the pkey
on the other side) to a column with a very large fraction of nulls.  We
only discovered this recently; it's fixed as of 7.3.3:

2003-04-15 01:18  tgl

	* src/backend/utils/adt/selfuncs.c (REL7_3_STABLE): eqjoinsel's
	logic for case where MCV lists are not present should account
	for NULLs; in hindsight this is obvious since the code for the
	MCV-lists case would reduce to this when there are zero entries
	in both lists.  Per example from Alec Mitchell.

Possibly you could backpatch that into 7.2, although I'd think an update
to 7.3.4 would be a more profitable use of time.

			regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
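The fix in that commit message can be sketched numerically. This is an illustration of the estimator logic described in the thread, not the actual selfuncs.c code: the function name and the 50,000-row figure for the unique pkey side are made up for the example; the 90% null fraction and 113 distinct values come from Josh's numbers.

```python
def eqjoinsel_no_mcv(nullfrac1, nd1, nullfrac2, nd2):
    """Equality-join selectivity when no most-common-value lists exist.

    The pre-7.3.3 logic effectively returned 1 / max(nd1, nd2), ignoring
    NULLs.  The fix scales that by the non-null fraction of each side,
    since a NULL can never satisfy an equality join clause.
    """
    return (1.0 - nullfrac1) * (1.0 - nullfrac2) / max(nd1, nd2)

# Illustrative numbers: tgroup_id side is 90% NULL with 113 distinct
# values (per the thread); the pkey side is unique -- say 50,000 rows.
old_estimate = 1.0 / max(113, 50_000)
new_estimate = eqjoinsel_no_mcv(0.90, 113, 0.0, 50_000)

# With 90% NULLs on one side, the old estimate is 10x too high, which
# inflates the expected join output row count by the same factor.
print(old_estimate, new_estimate)
```

An overestimated join selectivity of this kind makes the planner expect far more matching rows than exist, which is exactly the sort of error that tips it toward a seq scan.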
Re: [PERFORM] Odd problem with performance in duplicate database
Tom,

> Partly.  The numbers suggest that in ANALYZE's default sample of 3000
> rows, it's only finding about a dozen non-null tgroup_ids (yielding the
> 0.996 null_frac value); and that in one case all dozen are different and
> in the other case there are two duplicates.  It would help if you
> boosted the stats target for this column by a factor of 10.  (You can
> do that in 7.2, btw --- IIRC the only problem is that a pg_dump won't
> show that you did so.)

Also, there doesn't seem to be any way in 7.2 for me to find out what the
current statistics target for a column is.  What am I missing?

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
    http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] Odd problem with performance in duplicate database
Folks,

More followup on this: the crucial difference between the two execution
plans is this clause.

The test db has:

  ->  Seq Scan on case_clients  (cost=0.00..3673.48 rows=11274 width=11)
        (actual time=0.02..302.20 rows=8822 loops=855)

whereas the live db has:

  ->  Index Scan using idx_caseclients_case on case_clients
        (cost=0.00..5.10 rows=1 width=11)
        (actual time=0.03..0.04 rows=1 loops=471)

Using enable_seqscan = false fixes this, but is obviously not a long-term
solution.

I've re-created the test system from an immediate copy of the live
database, and checked that the main tables and indexes were reproduced
faithfully.

Lowering random_page_cost seems to do the trick.  But I'm still
mystified; why would one identical database pick a different plan than
its copy?

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco
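The two session-level experiments Josh mentions can be run like this; both are diagnostic knobs, not fixes, and the query itself is left as a placeholder:

```sql
-- Diagnostic only: disable seq scans for this session to confirm the
-- index plan is actually cheaper in practice.
SET enable_seqscan = off;

-- Alternative experiment: make random I/O look cheaper to the planner.
SET random_page_cost = 1.5;

-- Re-run the problem query under each setting and compare.
EXPLAIN ANALYZE SELECT ...;
```

Both settings revert at the end of the session, which is why they are useful for isolating a costing problem but unsuitable as a production workaround.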
Re: [PERFORM] Odd problem with performance in duplicate database
Tom,

> Let's see the pg_stats rows for case_clients in both databases.  The
> entries for trial_groups might be relevant too.  My reading is that the
> case is borderline; that is, because the correlation is about 10-20%
> higher on the test database (since it was restored clean from backup),
> the planner is resorting to a seq scan.  At which point the spectre of
> random_page_cost less than 1.0 rears its ugly head again.

The planner seems to regard this as a borderline case, but it's far from
borderline ... the index scan takes 260ms, the seq scan takes 244,000ms.
Yet my random_page_cost is set pretty low already, at 1.5.  It seems like
I'd have to set random_page_cost to less than 1.0 to make sure that the
planner never used a seq scan.  Which kinda defies the meaning of the
setting.

*sigh*  Wish the client would pay for an upgrade ...

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster
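The pg_stats comparison Tom asks for can be pulled with a query along these lines, run on both databases (table names from the thread; the column list shows the statistics discussed above):

```sql
-- Compare these rows between the live and test databases; null_frac,
-- n_distinct, and correlation are the values driving the plan choice.
SELECT tablename, attname, null_frac, n_distinct, correlation
FROM pg_stats
WHERE tablename IN ('case_clients', 'trial_groups');
```

The correlation column is the one Tom's "borderline" reading hinges on: a freshly restored table is laid out in near-index order, which raises correlation and makes sequential access look relatively cheaper to the planner.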