Re: [PERFORM] Left joining against two empty tables makes a query SLOW
On 7/28/05, Chris Travers [EMAIL PROTECTED] wrote: Hi all;I have a customer who currently uses an application which had becomeslow.After doing some digging, I found the slow query:SELECT c.accno, c.description, c.link, c.category, ac.project_id,p.projectnumber ,a.department_id, d.description AS departmentFROM chart c JOIN acc_trans ac ON (ac.chart_id = c.id)JOIN ar a ON (a.id = ac.trans_id)LEFT JOIN project p ON ( ac.project_id = p.id)LEFT JOIN department d ON (d.id = a.department_id)WHERE a.customer_id = 11373 AND a.id IN (SELECT max(id) FROM ar WHERE customer_id = 11373 );(reformatted for readability)This is taking 10 seconds to run.Interestingly, both the project and department tables are blank, and ifI omit them, the query becomes:SELECT c.accno, c.description , c.link, c.category, ac.project_idFROM chart c JOIN acc_trans ac ON (ac.chart_id = c.id)JOIN ar a ON (a.id = ac.trans_id)WHERE a.customer_id = 11373 AND a.id IN (SELECT max(id) FROM ar WHERE customer_id = 11373);This takes 139ms.1% of the previous query.The plan for the long query is:QUERY PLAN Hash IN Join(cost=87337.25..106344.93 rows=41 width=118) (actualtime=7615.843..9850.209 rows=10 loops=1) Hash Cond: (outer.trans_id = inner.max) -Merge Right Join(cost= 86620.57..100889.85 rows=947598width=126) (actual time=7408.830..9200.435 rows=177769 loops=1) Merge Cond: (outer.id = inner.department_id) -Index Scan using department_id_key on department d (cost=0.00..52.66rows=1060 width=36) (actual time=0.090..0.090 rows=0 loops=1) vacuum reindex the department and project table as the planner expects there are 1060 rows but actually returning nothing. -Sort(cost=86620.57..87067.55 rows=178792 width=94)(actual time= 7408.709..7925.843 rows=177769 loops=1) Sort Key: a.department_id -Merge Right Join(cost=45871.18..46952.83rows=178792 width=94) (actual time=4962.122..6671.319 rows=177769 loops=1) Merge Cond: (outer.id = inner.project_id) -Index Scan using project_id_key on project p(cost=0.00..49.80 rows=800 width=36) (actual time=0.007..0.007 rows=0loops=1) -Sort(cost=45871.18..46318.16 rows=178792width=62) (actual time=4962.084..5475.636 rows=177769 loops=1) Sort Key: ac.project_id -Hash Join(cost=821.20..13193.43rows=178792 width=62) (actual time=174.905..4295.685 rows=177769 loops=1) Hash Cond: (outer.chart_id = inner.id) -Hash Join(cost=817.66..10508.02rows=178791width=20) (actual time=173.952..2840.824 rows=177769 loops=1) Hash Cond: (outer.trans_id =inner.id) -Seq Scan on acc_trans ac(cost=0.00..3304.38 rows=181538 width=12) (actual time=0.062..537.753rows=181322 loops=1) -Hash(cost=659.55..659.55rows=22844 width=8) (actual time=173.625..173.625 rows=0 loops=1) -Seq Scan on ar a(cost=0.00..659.55 rows=22844 width=8) (actual time=0.022..101.828rows=22844 loops=1) Filter: (customer_id= 11373) -Hash(cost=3.23..3.23 rows=123width=50) (actual time=0.915..0.915 rows=0 loops=1) -Seq Scan on chart c(cost=0.00..3.23 rows=123 width=50) (actual time=0.013..0.528 rows=123loops=1) -Hash(cost=716.67..716.67 rows=1 width=4) (actualtime=129.037..129.037 rows=0 loops=1) -Subquery Scan IN_subquery(cost=716.66..716.67 rows=1width=4) (actual time=129.017..129.025 rows=1 loops=1) -Aggregate(cost=716.66..716.66 rows=1 width=4)(actual time=129.008..129.011 rows=1 loops=1) -Seq Scan on ar(cost=0.00..659.55 rows=22844width=4) (actual time=0.020..73.266 rows=22844 loops=1) Filter: (customer_id = 11373) Total runtime: 9954.133 ms(28 rows)The shorter query's plan is:QUERY PLAN- Hash Join(cost=728.42..732.96 rows=8 width=50) (actualtime=130.908..131.593 rows=10 loops=1) Hash Cond: (outer.id = inner.chart_id) -Seq Scan on chart c(cost=0.00..3.23 rows=123 width=50) (actualtime=0.006..0.361 rows=123 loops=1) -Hash(cost=728.40..728.40 rows=8 width=8) (actualtime=130.841..130.841 rows=0 loops=1) -Nested Loop(cost=716.67..728.40 rows=8 width=8) (actualtime=130.692..130.805 rows=10 loops=1) -Nested Loop(cost=716.67..720.89 rows=1 width=8)(actual time=130.626..130.639 rows=1 loops=1) -HashAggregate(cost=716.67..716.67 rows=1width=4) (actual time=130.484..130.487 rows=1 loops=1) -Subquery Scan IN_subquery(cost=716.66..716.67 rows=1 width=4) (actual time=130.455..130.464rows=1 loops=1) -Aggregate(cost=716.66..716.66rows=1 width=4) (actual time=130.445..130.448 rows=1 loops=1) -Seq Scan on ar(cost=0.00..659.55 rows=22844 width=4) (actual time=0.020..74.174rows=22844 loops=1) Filter: (customer_id = 11373) -Index Scan using ar_id_key on ar a(cost=0.00..4.20 rows=1 width=4) (actual time=0.122..0.125 rows=1 loops=1) Index Cond: (a.id = outer.max) Filter: (customer_id = 11373) -Index Scan using acc_trans_trans_id_key on acc_transac(cost=0.00..7.41 rows=8 width=12) (actual
Re: [PERFORM] Left joining against two empty tables makes a query
Gnanavel S wrote: vacuum reindex the department and project table as the planner expects there are 1060 rows but actually returning nothing. I guess I should have mentioned that I have been vacuuming and reindexing at least once a week, and I did so just before running this test. Normally I do: vacuum analyze; reindex database ; Secondly, the project table has *never* had anything in it. So where are these numbers coming from? Best Wishes, Chris Travers Metatron Technology Consulting ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PERFORM] Left joining against two empty tables makes a query
Chris Travers [EMAIL PROTECTED] writes: Secondly, the project table has *never* had anything in it. So where are these numbers coming from? The planner is designed to assume a certain minimum size (10 pages) when it sees that a table is of zero physical length. The reason for this is that there are lots of scenarios where a plan created just after a table is first created will continue to be used while the table is filled, and if we optimized on the assumption of zero size we would produce plans that seriously suck once the table gets big. Assuming a few thousand rows keeps us out of the worst problems of this type. (If we had an infrastructure for regenerating cached plans then we could fix this more directly, by replanning whenever the table size changes too much. We don't yet but I hope there will be something by 8.2.) You might try going ahead and actually putting a row or two into projects; vacuuming that will change the state to where the planner will believe the small size. (If you aren't ever planning to have anything in projects, why have the table at all?) regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Left joining against two empty tables makes a query SLOW
On 7/28/05, Chris Travers [EMAIL PROTECTED] wrote: Gnanavel S wrote: vacuum reindex the department and project table as the planner expects there are 1060 rows but actually returning nothing.I guess I should have mentioned that I have been vacuuming and reindexing at least once a week, and I did so just before running this test.Normally I do:vacuum analyze;reindex database ; reindex the tables separately. Secondly, the project table has *never* had anything in it.So whereare these numbers coming from? pg_statistics Best Wishes,Chris TraversMetatron Technology Consulting -- with regards,S.GnanavelSatyam Computer Services Ltd.
Re: [PERFORM] Left joining against two empty tables makes a query
Gnanavel S wrote: reindex the tables separately. Reindexing should not affect this problem, anyway. -Neil ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PERFORM] Left joining against two empty tables makes a query
Secondly, the project table has *never* had anything in it. So where are these numbers coming from? pg_statistics I very much doubt that. I was unable to locate any rows in pg_statistic where the pg_class.oid for either table matched any row's starelid. Tom's argument that this is behavior by design makes sense. I assumed that something like that had to be going on, otherwise there would be nowhere for the numbers to come from. I.e. if there never were any rows in the table, then if pg_statistic is showing 1060 rows, we have bigger problems than a bad query plan. I hope however that eventually tables which are truly empty can be treated intelligently sometime in the future in Left Joins. Otherwise this limits the usefulness of out of the box solutions which may have functionality that we don't use. Such solutions can then kill the database performance quite easily. Chris Travers Metatron Technology Consulting ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[PERFORM] Left joining against two empty tables makes a query SLOW
Hi all; I have a customer who currently uses an application which had become slow. After doing some digging, I found the slow query: SELECT c.accno, c.description, c.link, c.category, ac.project_id, p.projectnumber, a.department_id, d.description AS department FROM chart c JOIN acc_trans ac ON (ac.chart_id = c.id) JOIN ar a ON (a.id = ac.trans_id) LEFT JOIN project p ON (ac.project_id = p.id) LEFT JOIN department d ON (d.id = a.department_id) WHERE a.customer_id = 11373 AND a.id IN ( SELECT max(id) FROM ar WHERE customer_id = 11373 ); (reformatted for readability) This is taking 10 seconds to run. Interestingly, both the project and department tables are blank, and if I omit them, the query becomes: SELECT c.accno, c.description, c.link, c.category, ac.project_id FROM chart c JOIN acc_trans ac ON (ac.chart_id = c.id) JOIN ar a ON (a.id = ac.trans_id) WHERE a.customer_id = 11373 AND a.id IN ( SELECT max(id) FROM ar WHERE customer_id = 11373 ); This takes 139ms. 1% of the previous query. The plan for the long query is: QUERY PLAN Hash IN Join (cost=87337.25..106344.93 rows=41 width=118) (actual time=7615.843..9850.209 rows=10 loops=1) Hash Cond: (outer.trans_id = inner.max) - Merge Right Join (cost=86620.57..100889.85 rows=947598 width=126) (actual time=7408.830..9200.435 rows=177769 loops=1) Merge Cond: (outer.id = inner.department_id) - Index Scan using department_id_key on department d (cost=0.00..52.66 rows=1060 width=36) (actual time=0.090..0.090 rows=0 loops=1) - Sort (cost=86620.57..87067.55 rows=178792 width=94) (actual time=7408.709..7925.843 rows=177769 loops=1) Sort Key: a.department_id - Merge Right Join (cost=45871.18..46952.83 rows=178792 width=94) (actual time=4962.122..6671.319 rows=177769 loops=1) Merge Cond: (outer.id = inner.project_id) - Index Scan using project_id_key on project p (cost=0.00..49.80 rows=800 width=36) (actual time=0.007..0.007 rows=0 loops=1) - Sort (cost=45871.18..46318.16 rows=178792 width=62) (actual time=4962.084..5475.636 rows=177769 loops=1) Sort Key: ac.project_id - Hash Join (cost=821.20..13193.43 rows=178792 width=62) (actual time=174.905..4295.685 rows=177769 loops=1) Hash Cond: (outer.chart_id = inner.id) - Hash Join (cost=817.66..10508.02 rows=178791 width=20) (actual time=173.952..2840.824 rows=177769 loops=1) Hash Cond: (outer.trans_id = inner.id) - Seq Scan on acc_trans ac (cost=0.00..3304.38 rows=181538 width=12) (actual time=0.062..537.753 rows=181322 loops=1) - Hash (cost=659.55..659.55 rows=22844 width=8) (actual time=173.625..173.625 rows=0 loops=1) - Seq Scan on ar a (cost=0.00..659.55 rows=22844 width=8) (actual time=0.022..101.828 rows=22844 loops=1) Filter: (customer_id = 11373) - Hash (cost=3.23..3.23 rows=123 width=50) (actual time=0.915..0.915 rows=0 loops=1) - Seq Scan on chart c (cost=0.00..3.23 rows=123 width=50) (actual time=0.013..0.528 rows=123 loops=1) - Hash (cost=716.67..716.67 rows=1 width=4) (actual time=129.037..129.037 rows=0 loops=1) - Subquery Scan IN_subquery (cost=716.66..716.67 rows=1 width=4) (actual time=129.017..129.025 rows=1 loops=1) - Aggregate (cost=716.66..716.66 rows=1 width=4) (actual time=129.008..129.011 rows=1 loops=1) - Seq Scan on ar (cost=0.00..659.55 rows=22844 width=4) (actual time=0.020..73.266 rows=22844 loops=1) Filter: (customer_id = 11373) Total runtime: 9954.133 ms (28 rows) The shorter query's plan is: QUERY PLAN - Hash Join (cost=728.42..732.96 rows=8 width=50) (actual time=130.908..131.593 rows=10 loops=1) Hash Cond: (outer.id = inner.chart_id) - Seq Scan on chart c (cost=0.00..3.23 rows=123 width=50) (actual time=0.006..0.361 rows=123 loops=1) - Hash (cost=728.40..728.40 rows=8 width=8) (actual time=130.841..130.841 rows=0 loops=1) - Nested Loop