16 квітня 2010 р. 11:31 Yeb Havinga <yebhavi...@gmail.com> написав:
> Віталій Тимчишин wrote: > >> Hello. >> >> I have a query that performs very poor because there is a limit on join >> column that is not applied to other columns: >> >> select * from company this_ left outer join company_tag this_1_ on >> this_.id=this_1_.company_id left outer join company_measures companymea2_ on >> this_.id=companymea2_.company_id left outer join company_descr ces3_ on >> this_.id=ces3_.company_id where this_1_.tag_id = 7 and this_.id>50000000 and >> this_1_.company_id>50000000 >> order by this_.id asc limit 1000; >> >> (plan1.txt) >> Total runtime: 7794.692 ms >> >> At the same time if I apply the limit (>50000000) to other columns in >> query itself it works like a charm: >> >> select * from company this_ left outer join company_tag this_1_ on >> this_.id=this_1_.company_id left outer join company_measures companymea2_ on >> this_.id=companymea2_.company_id left outer join company_descr ces3_ on >> this_.id=ces3_.company_id where this_1_.tag_id = 7 and this_.id>50000000 and >> this_1_.company_id>50000000 >> and companymea2_.company_id>50000000 and ces3_.company_id>50000000 >> order by this_.id asc limit 1000; >> >> (plan2.txt) >> Total runtime: 27.547 ms >> >> I've thought and someone in this list've told me that this should be done >> automatically. >> > Yes, if you have in a query a=b and b=c, then the optimizer figures out > that a=c as well. (a,b and c are then member of the same equivalence class). > > However both queries are not the same, since the joins you're using are > outer joins. In the first it's possible that records are returned for > company records with no matching ces3_ records, the ces3_ records is null in > that case. In the second query no NULL ces3_ information may be returned. > OK, but when I move limit to join condition the query is still fast: select * from company this_ left outer join company_tag this_1_ on this_.id=this_1_.company_id left outer join company_measures companymea2_ on this_.id=companymea2_.company_id and companymea2_.company_id>50000000 left outer join company_descr ces3_ on this_.id=ces3_.company_id and ces3_.company_id>50000000 where this_1_.tag_id = 7 and this_.id>50000000 and this_1_.company_id>50000000 order by this_.id asc limit 1000; (plan3.txt), Total runtime: 26.327 ms BTW: Changing slow query to inner joins do not make it fast > > Another thing is it seems that the number of rows guessed is far off from > the actual number of rows, is the number 5000000 artificial or are you're > statistics old or too small histogram/mcv's? > Nope, I suppose this is because of limit. If I remove the limit, the estimations are quite correct. There are ~6 millions of row in each table.
Limit (cost=293.40..1028.60 rows=1000 width=1209) (actual time=0.128..24.751 rows=1000 loops=1) -> Merge Left Join (cost=293.40..4197731.11 rows=5709243 width=1209) (actual time=0.124..21.968 rows=1000 loops=1) Merge Cond: (this_.id = (companymea2_.company_id)::bigint) -> Merge Left Join (cost=246.59..3681230.10 rows=5709243 width=1141) (actual time=0.099..15.284 rows=1000 loops=1) Merge Cond: (this_.id = (ces3_.company_id)::bigint) -> Merge Join (cost=37.87..2435536.00 rows=5709243 width=639) (actual time=0.074..8.487 rows=1000 loops=1) Merge Cond: (this_.id = (this_1_.company_id)::bigint) -> Index Scan using pk_comp_m on company this_ (cost=0.00..2139324.53 rows=6886598 width=623) (actual time=0.037..1.860 rows=1054 loops=1) Index Cond: (id > 50000000) -> Index Scan using company_tag_company_id_idx on company_tag this_1_ (cost=0.00..207678.85 rows=5709243 width=16) (actual time=0.027..1.758 rows=1000 loops=1) Index Cond: ((this_1_.company_id)::bigint > 50000000) Filter: ((this_1_.tag_id)::bigint = 7) -> Index Scan using cws_company_descr_unique on company_descr ces3_ (cost=0.00..1169511.92 rows=5757068 width=502) (actual time=0.020..1.788 rows=1054 loops=1) Index Cond: ((ces3_.company_id)::bigint > 50000000) -> Index Scan using pk_comp_ms on company_measures companymea2_ (cost=0.00..440945.79 rows=5688885 width=68) (actual time=0.019..1.729 rows=1054 loops=1) Index Cond: ((companymea2_.company_id)::bigint > 50000000) Total runtime: 26.327 ms
-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance