On 16 April 2010 at 11:31, Yeb Havinga <yebhavi...@gmail.com> wrote:

> Віталій Тимчишин wrote:
>
>> Hello.
>>
>> I have a query that performs very poorly because there is a limit on the join
>> column that is not applied to the other columns:
>>
>> select * from company this_
>>   left outer join company_tag this_1_ on this_.id = this_1_.company_id
>>   left outer join company_measures companymea2_ on this_.id = companymea2_.company_id
>>   left outer join company_descr ces3_ on this_.id = ces3_.company_id
>> where this_1_.tag_id = 7
>>   and this_.id > 50000000 and this_1_.company_id > 50000000
>> order by this_.id asc limit 1000;
>>
>> (plan1.txt)
>> Total runtime: 7794.692 ms
>>
>> At the same time, if I apply the limit (>50000000) to the other columns in
>> the query itself, it works like a charm:
>>
>> select * from company this_
>>   left outer join company_tag this_1_ on this_.id = this_1_.company_id
>>   left outer join company_measures companymea2_ on this_.id = companymea2_.company_id
>>   left outer join company_descr ces3_ on this_.id = ces3_.company_id
>> where this_1_.tag_id = 7
>>   and this_.id > 50000000 and this_1_.company_id > 50000000
>>   and companymea2_.company_id > 50000000 and ces3_.company_id > 50000000
>> order by this_.id asc limit 1000;
>>
>> (plan2.txt)
>> Total runtime: 27.547 ms
>>
>> I thought, and someone on this list told me, that this should be done
>> automatically.
>>
> Yes, if you have a=b and b=c in a query, then the optimizer figures out
> that a=c as well (a, b and c are then members of the same equivalence class).
>
> However, the two queries are not the same, since the joins you're using are
> outer joins. In the first, it's possible that rows are returned for
> company records with no matching ces3_ record; the ces3_ columns are NULL in
> that case. In the second query, no NULL-extended ces3_ rows may be returned.
>
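A toy sketch of the difference described above (same table names as in the thread, but with hypothetical data: company ids 1 and 2, and a company_descr row only for id 1):

```sql
-- A LEFT JOIN keeps companies with no matching descr row:
SELECT c.id, d.company_id
FROM company c
LEFT JOIN company_descr d ON c.id = d.company_id;
-- returns (1, 1) and (2, NULL): the NULL-extended row survives

-- A WHERE predicate on the right table's column rejects NULL-extended rows,
-- because NULL > 0 evaluates to NULL, not true:
SELECT c.id, d.company_id
FROM company c
LEFT JOIN company_descr d ON c.id = d.company_id
WHERE d.company_id > 0;
-- returns only (1, 1)
```

This is why the planner cannot add such a predicate automatically: it would change which rows the outer join returns.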

OK, but when I move the limit to the join condition, the query is still fast:

select * from company this_
  left outer join company_tag this_1_ on this_.id = this_1_.company_id
  left outer join company_measures companymea2_
    on this_.id = companymea2_.company_id and companymea2_.company_id > 50000000
  left outer join company_descr ces3_
    on this_.id = ces3_.company_id and ces3_.company_id > 50000000
where this_1_.tag_id = 7
  and this_.id > 50000000 and this_1_.company_id > 50000000
order by this_.id asc limit 1000;

(plan3.txt)
Total runtime: 26.327 ms
BTW: Changing the slow query to inner joins does not make it fast either.
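For what it's worth, the same restriction can also be expressed by pushing the bound into derived tables (my sketch, using the same tables; this is semantically equivalent to putting the predicate in the ON clause, since the filter is applied to the right-hand side before the outer join null-extends it):

```sql
select * from company this_
  left outer join company_tag this_1_ on this_.id = this_1_.company_id
  -- filter the right-hand input before the outer join instead of after it
  left outer join (select * from company_measures
                   where company_id > 50000000) companymea2_
    on this_.id = companymea2_.company_id
  left outer join (select * from company_descr
                   where company_id > 50000000) ces3_
    on this_.id = ces3_.company_id
where this_1_.tag_id = 7
  and this_.id > 50000000 and this_1_.company_id > 50000000
order by this_.id asc limit 1000;
```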


>
> Another thing: it seems that the number of rows guessed is far off from
> the actual number of rows. Is the number 5000000 artificial, or are your
> statistics old, or is the histogram/MCV list too small?
>

Nope, I suppose this is because of the limit. If I remove the limit, the
estimates are quite correct. There are ~6 million rows in each table.
Limit  (cost=293.40..1028.60 rows=1000 width=1209) (actual time=0.128..24.751 rows=1000 loops=1)
  ->  Merge Left Join  (cost=293.40..4197731.11 rows=5709243 width=1209) (actual time=0.124..21.968 rows=1000 loops=1)
        Merge Cond: (this_.id = (companymea2_.company_id)::bigint)
        ->  Merge Left Join  (cost=246.59..3681230.10 rows=5709243 width=1141) (actual time=0.099..15.284 rows=1000 loops=1)
              Merge Cond: (this_.id = (ces3_.company_id)::bigint)
              ->  Merge Join  (cost=37.87..2435536.00 rows=5709243 width=639) (actual time=0.074..8.487 rows=1000 loops=1)
                    Merge Cond: (this_.id = (this_1_.company_id)::bigint)
                    ->  Index Scan using pk_comp_m on company this_  (cost=0.00..2139324.53 rows=6886598 width=623) (actual time=0.037..1.860 rows=1054 loops=1)
                          Index Cond: (id > 50000000)
                    ->  Index Scan using company_tag_company_id_idx on company_tag this_1_  (cost=0.00..207678.85 rows=5709243 width=16) (actual time=0.027..1.758 rows=1000 loops=1)
                          Index Cond: ((this_1_.company_id)::bigint > 50000000)
                          Filter: ((this_1_.tag_id)::bigint = 7)
              ->  Index Scan using cws_company_descr_unique on company_descr ces3_  (cost=0.00..1169511.92 rows=5757068 width=502) (actual time=0.020..1.788 rows=1054 loops=1)
                    Index Cond: ((ces3_.company_id)::bigint > 50000000)
        ->  Index Scan using pk_comp_ms on company_measures companymea2_  (cost=0.00..440945.79 rows=5688885 width=68) (actual time=0.019..1.729 rows=1054 loops=1)
              Index Cond: ((companymea2_.company_id)::bigint > 50000000)
Total runtime: 26.327 ms
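The estimate-vs-actual gap in the plan above is expected under LIMIT: the rows=5709243 estimates describe running each node to completion, while execution stops once 1000 output rows are produced. One way to confirm the underlying estimates are sound is to run the join without the limit (a sketch, same tables):

```sql
-- With no LIMIT, estimated rows should roughly match actual rows
-- if the statistics are up to date:
explain analyze
select count(*)
from company this_
  join company_tag this_1_ on this_.id = this_1_.company_id
where this_1_.tag_id = 7
  and this_.id > 50000000 and this_1_.company_id > 50000000;
```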

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
