Re: [PERFORM] Bad plan after vacuum analyze
On Thu, 11/05/2005 at 22:59 +0200, Guillaume Smet wrote:
> Anyway, I tried to work on the statistics as you told me and here are
> the results:
> ccm_perf=# ALTER TABLE acs_objects ALTER COLUMN object_id SET STATISTICS 30;
> ALTER TABLE
> ccm_perf=# ANALYZE acs_objects;
> ANALYZE
>
> ccm_perf=# \i query_section.sql
> ... correct plan ...
> Total runtime: 0.555 ms

Given Tom's analysis, how can increasing the stats target change which
plan is chosen?

-- 
Markus Bertheau <[EMAIL PROTECTED]>
Re: [PERFORM] Bad plan after vacuum analyze
Quoting Guillaume Smet <[EMAIL PROTECTED]>:
> Hi,
>
> We have a performance problem on a particular query.
...

I have to say it: this was the best laid-out set of details behind a
problem I've ever seen on this list; I'm going to try to live up to it
the next time I have a problem of my own.

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if
       your joining column's datatypes do not match
Re: [PERFORM] Bad plan after vacuum analyze
Josh, Tom,

Thanks for your explanations.

> In the meantime it seems like the quickest answer for Guillaume might
> be to try to avoid keeping any NULLs in parent_application_id.

I can't do that, as the majority of the applications don't have a
parent. Moreover, we use a third-party application and we cannot modify
all its internals.

Anyway, I tried to work on the statistics as you told me and here are
the results:

ccm_perf=# ALTER TABLE acs_objects ALTER COLUMN object_id SET STATISTICS 30;
ALTER TABLE
ccm_perf=# ANALYZE acs_objects;
ANALYZE

ccm_perf=# \i query_section.sql
... correct plan ...
Total runtime: 0.555 ms

So I think I will use this solution for the moment.

Thanks a lot for your help.

Regards,

-- 
Guillaume
Re: [PERFORM] Bad plan after vacuum analyze
Ah-ha, I can replicate the problem. This example uses tenk1 from the
regression database, which has a column unique2 containing just the
integers 0...

regression=# create table t1(f1 int);
CREATE TABLE
regression=# insert into t1 values(5);
INSERT 154632 1
regression=# insert into t1 values(7);
INSERT 154633 1
regression=# analyze t1;
ANALYZE
regression=# explain analyze select * from tenk1 right join t1 on (unique2=f1);
                                  QUERY PLAN
---------------------------------------------------------------------------
 Merge Right Join  (cost=1.03..1.37 rows=2 width=248) (actual time=0.507..0.617 rows=2 loops=1)
   Merge Cond: ("outer".unique2 = "inner".f1)
   ->  Index Scan using tenk1_unique2 on tenk1  (cost=0.00..498.24 rows=10024 width=244) (actual time=0.126..0.242 rows=9 loops=1)
   ->  Sort  (cost=1.03..1.03 rows=2 width=4) (actual time=0.145..0.153 rows=2 loops=1)
         Sort Key: t1.f1
         ->  Seq Scan on t1  (cost=0.00..1.02 rows=2 width=4) (actual time=0.029..0.049 rows=2 loops=1)
 Total runtime: 1.497 ms
(7 rows)

The planner correctly perceives that only a small part of the unique2
index will need to be scanned, and hence thinks the merge is cheap ---
much cheaper than if the whole index had to be scanned. And it is:
notice that only 9 rows were actually pulled from the index. Once we
got to unique2 = 8, nodeMergejoin.c could see that no more matches to
f1 were possible.
But watch this:

regression=# insert into t1 values(null);
INSERT 154634 1
regression=# explain analyze select * from tenk1 right join t1 on (unique2=f1);
                                  QUERY PLAN
---------------------------------------------------------------------------
 Merge Right Join  (cost=1.03..1.37 rows=2 width=248) (actual time=0.560..290.874 rows=3 loops=1)
   Merge Cond: ("outer".unique2 = "inner".f1)
   ->  Index Scan using tenk1_unique2 on tenk1  (cost=0.00..498.24 rows=10024 width=244) (actual time=0.139..106.982 rows=1 loops=1)
   ->  Sort  (cost=1.03..1.03 rows=2 width=4) (actual time=0.181..0.194 rows=3 loops=1)
         Sort Key: t1.f1
         ->  Seq Scan on t1  (cost=0.00..1.02 rows=2 width=4) (actual time=0.032..0.067 rows=3 loops=1)
 Total runtime: 291.670 ms
(7 rows)

See what happened to the actual costs of the indexscan? All of a sudden
we had to scan the whole index, because there was a null in the other
input, and nulls sort high.

I wonder if it is worth fixing nodeMergejoin.c to not even try to match
nulls to the other input. We'd have to add a check to see if the join
operator is strict or not, but it nearly always will be.

The alternative would be to make the planner only believe in the
short-circuit path occurring if it thinks that the other input is
entirely non-null ... but this seems pretty fragile, since it only
takes one null to mess things up, and ANALYZE can hardly be counted on
to detect one null in a table.

In the meantime it seems like the quickest answer for Guillaume might
be to try to avoid keeping any NULLs in parent_application_id.

			regards, tom lane
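The short-circuit Tom describes, and why a single NULL defeats it, can be modeled in a few lines. This is a hypothetical sketch, not the real nodeMergejoin.c logic: the function name is made up, and NULL-as-infinity simply mimics a btree index where nulls sort last.

```python
def merge_right_join_scan_count(outer_sorted, inner_sorted):
    """Return how many rows must be fetched from the sorted outer
    (indexed) input before the merge join can stop.

    Mimics the early exit: once the outer value exceeds the largest
    inner value, no further matches are possible (assuming a strict
    join operator). A NULL in the inner input is modeled as +infinity
    because nulls sort high, so the early exit never fires.
    """
    keys = [float('inf') if v is None else v for v in inner_sorted]
    if not keys:
        return 0
    last = keys[-1]           # largest inner key (inf if a NULL is present)
    fetched = 0
    for v in outer_sorted:
        fetched += 1
        if v > last:          # short-circuit: nothing further can match
            break
    return fetched

outer = list(range(10000))    # like tenk1.unique2
print(merge_right_join_scan_count(outer, [5, 7]))        # 9: stops at unique2 = 8
print(merge_right_join_scan_count(outer, [5, 7, None]))  # 10000: full scan
```

With inner values {5, 7} the scan stops after 9 rows, matching the `rows=9` in the first plan; adding one NULL forces the entire outer input to be read, matching the jump from 1.5 ms to 291 ms above.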
Re: [PERFORM] Bad plan after vacuum analyze
> Well, those stats certainly appear to justify the planner's belief that
> the indexscan needn't run very far: the one value of
> parent_application_id is 1031 and this is below the smallest value of
> object_id seen by analyze.

Yes, it seems rather logical, but why does it cost so much if it should
be an effective way to find the row?

> You might have better luck if you increase
> the statistics target for acs_objects.object_id.

What do you mean exactly?

> (It'd be interesting
> to know what fraction of acs_objects actually does have object_id < 1032.)

ccm_perf=# SELECT COUNT(*) FROM acs_objects WHERE object_id < 1032;
 count
-------
    15

ccm_perf=# SELECT COUNT(*) FROM acs_objects;
 count
-------
 33510

-- 
Guillaume
Re: [PERFORM] Bad plan after vacuum analyze
Guillaume Smet <[EMAIL PROTECTED]> writes:
>> If so, can we see the pg_stats rows for the object_id and
>> parent_application_id columns?

> See attached file.

Well, those stats certainly appear to justify the planner's belief that
the indexscan needn't run very far: the one value of
parent_application_id is 1031, and this is below the smallest value of
object_id seen by analyze. You might have better luck if you increase
the statistics target for acs_objects.object_id. (It'd be interesting
to know what fraction of acs_objects actually does have object_id <
1032.)

			regards, tom lane
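Tom's reasoning can be illustrated with a toy model of how a planner turns histogram_bounds into a "fraction of the sorted index to scan" estimate. This is a hypothetical sketch (the function name is invented and this is not PostgreSQL's actual selectivity code), but the bounds are the ones from Guillaume's pg_stats output for acs_objects.object_id:

```python
import bisect

# histogram_bounds for acs_objects.object_id, from the pg_stats output
histogram_bounds = [1032, 34143, 112295, 120811, 285004,
                    420038, 449980, 453451, 457684, 609292, 710005]

def est_fraction_below(value, bounds):
    """Estimate the fraction of the column that is < value, by linear
    interpolation within equal-frequency histogram buckets."""
    if value <= bounds[0]:
        return 0.0
    if value >= bounds[-1]:
        return 1.0
    i = bisect.bisect_right(bounds, value) - 1
    lo, hi = bounds[i], bounds[i + 1]
    return (i + (value - lo) / (hi - lo)) / (len(bounds) - 1)

# The only parent_application_id value is 1031, below the first bound,
# so the planner expects the indexscan to terminate almost immediately:
print(est_fraction_below(1031, histogram_bounds))  # 0.0
```

In reality 15 of 33510 rows (about 0.04%) have object_id < 1032, so the "scan only a sliver of the index" belief is not unreasonable here; the question is what happens at runtime, not in the estimate. A larger statistics target gives more histogram buckets and therefore finer-grained fractions near the edges.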
Re: [PERFORM] Bad plan after vacuum analyze
Tom,

> So, the usual questions: have these two tables been ANALYZEd lately?

Yes, of course. As I wrote in my previous mail, here is how I reproduce
the problem:
- we load the dump in a new database (to be sure there is no problem on
  an index or something like that)
- query: it's fast (< 1 ms)
- VACUUM FULL ANALYZE;
- query: it's really slow (130 ms) and it's another plan
- set enable_seqscan=off;
- query: it's fast (< 1 ms): it uses the best plan

I reproduced it on two different servers exactly like that (7.4.5 and
7.4.7). I first met the problem on a production database with a VACUUM
ANALYZE run every night (and we don't have too many inserts a day on
this database).

> If so, can we see the pg_stats rows for the object_id and
> parent_application_id columns?

See attached file. If you're interested in a dump of these tables, just
tell me. There isn't any confidential information in them.

Regards,

-- 
Guillaume

 schemaname |  tablename  |  attname  | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs |                              histogram_bounds                               | correlation
------------+-------------+-----------+-----------+-----------+------------+------------------+-------------------+-----------------------------------------------------------------------------+-------------
 public     | acs_objects | object_id |         0 |         4 |         -1 |                  |                   | {1032,34143,112295,120811,285004,420038,449980,453451,457684,609292,710005} |    0.488069
(1 row)

 schemaname |  tablename   |        attname        | null_frac | avg_width | n_distinct | most_common_vals | most_common_freqs | histogram_bounds | correlation
------------+--------------+-----------------------+-----------+-----------+------------+------------------+-------------------+------------------+-------------
 public     | applications | parent_application_id |  0.928571 |         4 |          1 | {1031}           | {0.0714286}       |                  |           1
(1 row)
Re: [PERFORM] Bad plan after vacuum analyze
Josh Berkus writes:
> ->  Merge Right Join  (cost=8.92..9.26 rows=1 width=529) (actual time=129.100..129.103 rows=1 loops=1)
>       Merge Cond: ("outer".object_id = "inner".parent_application_id)
>       ->  Index Scan using acs_objects_object_id_p_hhkb1 on acs_objects t98  (cost=0.00..2554.07 rows=33510 width=81) (actual time=0.043..56.392 rows=33510 loops=1)
>       ->  Sort  (cost=8.92..8.93 rows=1 width=452) (actual time=0.309..0.310 rows=1 loops=1)
>             Sort Key: t22.parent_application_id

> Here the planner chooses a merge right join. This decision seems to
> have been made entirely on the basis of the cost of the join itself
> (total of 17) without taking the cost of the sort and index access
> (total of 2600+) into account.

> Tom, is this a possible error in planner logic?

No, it certainly hasn't forgotten to add in the costs of the inputs.
There might be a bug here, but if so it's much more subtle than that.

It looks to me like the planner believes that the one value of
t22.parent_application_id joins to something very early in the
acs_objects_object_id_p_hhkb1 sort order, and that it will therefore
not be necessary to run the indexscan to completion (or indeed very far
at all, considering that it's including such a small fraction of the
total indexscan cost).

[EMAIL PROTECTED] pointed out recently that this effect doesn't apply
to the outer side of an outer join; releases before 7.4.8 mistakenly
think it does. But unless my wires are totally crossed today,
acs_objects is the nullable side here, and so that error isn't
applicable anyway.

So, the usual questions: have these two tables been ANALYZEd lately?
If so, can we see the pg_stats rows for the object_id and
parent_application_id columns?

			regards, tom lane
Re: [PERFORM] Bad plan after vacuum analyze
Guillaume,

> We reproduced the problem on a 7.4.5 and on a 7.4.7 server.
> * we load the dump in a new database
> * query: it's fast (< 1ms)
> * VACUUM FULL ANALYZE;
> * query: it's really slow (130ms) and it's another plan
> * set enable_seqscan=off;
> * query: it's fast (< 1ms) : it uses the best plan

Looking at this, the planner seems convinced that the merge join is the
easiest way to do the OUTER JOINs, but it appears to be wrong; a nested
loop is faster. This isn't the only place I've encountered our
optimizer doing this --- underestimating the cost of a merge join.
This seems to be because the merge_join vs. nested_loop decision is
being made in the planner without taking the double-sort and index
access costs into account. This query is an excellent example.

"Good" plan:

 Nested Loop Left Join  (cost=2.44..17.36 rows=1 width=5532) (actual time=0.441..0.466 rows=1 loops=1)
   Join Filter: ("outer".parent_application_id = "inner".application_id)
   ->  Nested Loop Left Join  (cost=2.44..15.73 rows=1 width=5214) (actual time=0.378..0.402 rows=1 loops=1)

See, here the planner thinks that the 2 nested loops will cost "35".
"bad" plan: Merge Right Join (cost=9.27..9.48 rows=1 width=545) (actual time=129.364..129.365 rows=1 loops=1) Merge Cond: ("outer".application_id = "inner".parent_application_id) -> Index Scan using applicati_applicati_id_p_ogstm on applications t116 (cost=0.00..5.51 rows=28 width=20) (actual time=0.030..0.073 rows=28 loops=1) -> Sort (cost=9.27..9.27 rows=1 width=529) (actual time=129.202..129.203 rows=1 loops=1) Sort Key: t22.parent_application_id -> Merge Right Join (cost=8.92..9.26 rows=1 width=529) (actual time=129.100..129.103 rows=1 loops=1) Merge Cond: ("outer".object_id = "inner".parent_application_id) -> Index Scan using acs_objects_object_id_p_hhkb1 on acs_objects t98 (cost=0.00..2554.07 rows=33510 width=81) (actual time=0.043..56.392 rows=33510 loops=1) -> Sort (cost=8.92..8.93 rows=1 width=452) (actual time=0.309..0.310 rows=1 loops=1) Sort Key: t22.parent_application_id Here the planner chooses a merge right join. This decision seems to have been made entirely on the basis of the cost of the join itself (total of 17) without taking the cost of the sort and index access (total of 2600+) into account. Tom, is this a possible error in planner logic? -- --Josh Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly