Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-20 Thread Hannu Krosing
On Fri, 2010-04-16 at 11:02 +0300, Віталій Тимчишин wrote:
 Hello.
 
 
 I have a query that performs very poor because there is a limit on
 join column that is not applied to other columns:
 
 
 select * from company this_ left outer join company_tag this_1_ on
 this_.id=this_1_.company_id left outer join company_measures
 companymea2_ on this_.id=companymea2_.company_id left outer join
 company_descr ces3_ on this_.id=ces3_.company_id where this_1_.tag_id
 = 7 and this_.id5000 
 and this_1_.company_id5000
 order by this_.id asc limit 1000;
 
 
 (plan1.txt)
 Total runtime: 7794.692 ms
 
 
 At the same time if I apply the limit (5000) to other columns in
 query itself it works like a charm:
 
 
 select * from company this_ left outer join company_tag this_1_ on
 this_.id=this_1_.company_id left outer join company_measures
 companymea2_ on this_.id=companymea2_.company_id left outer join
 company_descr ces3_ on this_.id=ces3_.company_id where this_1_.tag_id
 = 7 and this_.id5000 
 and this_1_.company_id5000
 and companymea2_.company_id5000 and ces3_.company_id5000
 order by this_.id asc limit 1000;

The queries are not the same.

2nd variant will not return the rows where there are no matching rows
inthis_1_ , companymea2_ or ces3_.company_id

A query equivalent to first one would be:


select * from company this_ 
  left outer join company_tag this_1_ 
   on (this_.id=this_1_.company_id 
   and this_1_.company_id5000)
  left outer join company_measures companymea2_ 
   on (this_.id=companymea2_.company_id 
   and companymea2_.company_id5000)
  left outer join company_descr ces3_ 
   on (this_.id=ces3_.company_id 
   and ces3_.company_id5000)
 where this_1_.tag_id = 7 
   and this_.id5000 
 order by this_.id asc 
 limit 1000;


I'm not sure that planner considers the above form of plan rewrite, nor
that it would make much sense to do so unless there was a really small
number of rows where x_.company_id5000 


-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability 
   Services, Consulting and Training



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-19 Thread Matthew Wakeling

On Sat, 17 Apr 2010, Віталій Тимчишин wrote:

As of making planner more clever, may be it is possible to introduce
division on fast queries and long queries, so that if after fast
planning cost is greater then some configurable threshold, advanced planning
techniques (or settings) are used. As far as I have seen in this list, many
techniques are not used simply because they are too complex and could make
planning take too much time for really fast queries, but they are vital for
long ones.


+1. That's definitely a good idea in my view. The query optimiser I wrote 
(which sits on top of Postgres and makes use of materialised views to 
speed up queries) uses a similar approach - it expends effort proportional 
to the estimated cost of the query, as reported by EXPLAIN.


Matthew

--
To most people, solutions mean finding the answers. But to chemists,
solutions are things that are still all mixed up.
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Віталій Тимчишин
Hello.

I have a query that performs very poor because there is a limit on join
column that is not applied to other columns:

select * from company this_ left outer join company_tag this_1_ on
this_.id=this_1_.company_id left outer join company_measures companymea2_ on
this_.id=companymea2_.company_id left outer join company_descr ces3_ on
this_.id=ces3_.company_id where this_1_.tag_id = 7 and this_.id5000
and this_1_.company_id5000
order by this_.id asc limit 1000;

(plan1.txt)
Total runtime: 7794.692 ms

At the same time if I apply the limit (5000) to other columns in query
itself it works like a charm:

select * from company this_ left outer join company_tag this_1_ on
this_.id=this_1_.company_id left outer join company_measures companymea2_ on
this_.id=companymea2_.company_id left outer join company_descr ces3_ on
this_.id=ces3_.company_id where this_1_.tag_id = 7 and this_.id5000
and this_1_.company_id5000
and companymea2_.company_id5000 and ces3_.company_id5000
order by this_.id asc limit 1000;

(plan2.txt)
Total runtime: 27.547 ms

I've thought and someone in this list've told me that this should be done
automatically. But I have pretty recent server:
PostgreSQL 8.4.2 on amd64-portbld-freebsd8.0, compiled by GCC cc (GCC) 4.2.1
20070719  [FreeBSD], 64-bit
and it still do not work

Do I misunderstand something or this feature don't work in such a query?

Best regards, Vitalii Tymchyshyn
Limit  (cost=497.46..1464.50 rows=1000 width=693) (actual 
time=7767.721..7793.047 rows=1000 loops=1)
  -  Merge Left Join  (cost=497.46..5521612.64 rows=5709243 width=693) (actual 
time=7767.717..7790.274 rows=1000 loops=1)
Merge Cond: (this_.id = (companymea2_.company_id)::bigint)
-  Merge Left Join  (cost=404.31..4544508.54 rows=5709243 width=625) 
(actual time=4211.501..4227.215 rows=1000 loops=1)
  Merge Cond: (this_.id = (ces3_.company_id)::bigint)
  -  Merge Join  (cost=37.87..2435536.00 rows=5709243 width=123) 
(actual time=0.069..8.584 rows=1000 loops=1)
Merge Cond: (this_.id = (this_1_.company_id)::bigint)
-  Index Scan using pk_comp_m on company this_  
(cost=0.00..2139324.53 rows=6886598 width=107) (actual time=0.031..1.876 
rows=1054 loops=1)
  Index Cond: (id  5000)
-  Index Scan using company_tag_company_id_idx on 
company_tag this_1_  (cost=0.00..207678.85 rows=5709243 width=16) (actual 
time=0.027..1.757 rows=1000 loops=1)
  Index Cond: ((this_1_.company_id)::bigint  5000)
  Filter: ((this_1_.tag_id)::bigint = 7)
  -  Index Scan using company_descrs on company_descr ces3_  
(cost=0.00..2073526.89 rows=1996612 width=502) (actual time=0.014..2576.013 
rows=1097096 loops=1)
-  Index Scan using pk_comp_ms on company_measures companymea2_  
(cost=0.00..889427.81 rows=6821672 width=68) (actual time=0.020..1946.255 
rows=1097096 loops=1)
Total runtime: 7794.692 ms

Limit  (cost=330.57..4888.97 rows=1000 width=693) (actual time=0.125..26.011 
rows=1000 loops=1)
  -  Merge Join  (cost=330.57..5260651.29 rows=1153986 width=693) (actual 
time=0.122..23.215 rows=1000 loops=1)
Merge Cond: (this_.id = (this_1_.company_id)::bigint)
-  Merge Join  (cost=292.71..5023728.99 rows=1391960 width=677) 
(actual time=0.090..16.615 rows=1054 loops=1)
  Merge Cond: ((companymea2_.company_id)::bigint = this_.id)
  -  Merge Join  (cost=0.00..2846769.89 rows=1685017 width=570) 
(actual time=0.063..9.534 rows=1054 loops=1)
Merge Cond: ((companymea2_.company_id)::bigint = 
(ces3_.company_id)::bigint)
-  Index Scan using pk_comp_ms on company_measures 
companymea2_  (cost=0.00..848312.00 rows=565 width=68) (actual 
time=0.033..1.973 rows=1054 loops=1)
  Index Cond: ((company_id)::bigint  5000)
-  Index Scan using company_descrs on company_descr ces3_  
(cost=0.00..1963172.96 rows=1685017 width=502) (actual time=0.020..2.152 
rows=1054 loops=1)
  Index Cond: ((ces3_.company_id)::bigint  5000)
  -  Index Scan using pk_comp_m on company this_  
(cost=0.00..2139324.53 rows=6886598 width=107) (actual time=0.022..1.855 
rows=1054 loops=1)
Index Cond: (this_.id  5000)
-  Index Scan using company_tag_company_id_idx on company_tag this_1_  
(cost=0.00..207678.85 rows=5709243 width=16) (actual time=0.025..1.742 
rows=1000 loops=1)
  Index Cond: ((this_1_.company_id)::bigint  5000)
  Filter: ((this_1_.tag_id)::bigint = 7)
Total runtime: 27.547 ms


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Yeb Havinga

Віталій Тимчишин wrote:

Hello.

I have a query that performs very poor because there is a limit on 
join column that is not applied to other columns:


select * from company this_ left outer join company_tag this_1_ on 
this_.id=this_1_.company_id left outer join company_measures 
companymea2_ on this_.id=companymea2_.company_id left outer join 
company_descr ces3_ on this_.id=ces3_.company_id where this_1_.tag_id 
= 7 and this_.id5000 
and this_1_.company_id5000

order by this_.id asc limit 1000;

(plan1.txt)
Total runtime: 7794.692 ms

At the same time if I apply the limit (5000) to other columns in 
query itself it works like a charm:


select * from company this_ left outer join company_tag this_1_ on 
this_.id=this_1_.company_id left outer join company_measures 
companymea2_ on this_.id=companymea2_.company_id left outer join 
company_descr ces3_ on this_.id=ces3_.company_id where this_1_.tag_id 
= 7 and this_.id5000 
and this_1_.company_id5000

and companymea2_.company_id5000 and ces3_.company_id5000
order by this_.id asc limit 1000;

(plan2.txt)
Total runtime: 27.547 ms

I've thought and someone in this list've told me that this should be 
done automatically.
Yes, if you have in a query a=b and b=c, then the optimizer figures out 
that a=c as well. (a,b and c are then member of the same equivalence class).


However both queries are not the same, since the joins you're using are 
outer joins. In the first it's possible that records are returned for 
company records with no matching ces3_ records, the ces3_ records is 
null in that case. In the second query no NULL ces3_ information may be 
returned.


Another thing is it seems that the number of rows guessed is far off 
from the actual number of rows, is the number 500 artificial or are 
you're statistics old or too small histogram/mcv's?


regards,
Yeb Havinga


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Віталій Тимчишин
16 квітня 2010 р. 11:31 Yeb Havinga yebhavi...@gmail.com написав:

 Віталій Тимчишин wrote:

 Hello.

 I have a query that performs very poor because there is a limit on join
 column that is not applied to other columns:

 select * from company this_ left outer join company_tag this_1_ on
 this_.id=this_1_.company_id left outer join company_measures companymea2_ on
 this_.id=companymea2_.company_id left outer join company_descr ces3_ on
 this_.id=ces3_.company_id where this_1_.tag_id = 7 and this_.id5000 and
 this_1_.company_id5000
 order by this_.id asc limit 1000;

 (plan1.txt)
 Total runtime: 7794.692 ms

 At the same time if I apply the limit (5000) to other columns in
 query itself it works like a charm:

 select * from company this_ left outer join company_tag this_1_ on
 this_.id=this_1_.company_id left outer join company_measures companymea2_ on
 this_.id=companymea2_.company_id left outer join company_descr ces3_ on
 this_.id=ces3_.company_id where this_1_.tag_id = 7 and this_.id5000 and
 this_1_.company_id5000
 and companymea2_.company_id5000 and ces3_.company_id5000
 order by this_.id asc limit 1000;

 (plan2.txt)
 Total runtime: 27.547 ms

 I've thought and someone in this list've told me that this should be done
 automatically.

 Yes, if you have in a query a=b and b=c, then the optimizer figures out
 that a=c as well. (a,b and c are then member of the same equivalence class).

 However both queries are not the same, since the joins you're using are
 outer joins. In the first it's possible that records are returned for
 company records with no matching ces3_ records, the ces3_ records is null in
 that case. In the second query no NULL ces3_ information may be returned.


OK, but when I move limit to join condition the query is still fast:

select * from company this_ left outer join company_tag this_1_ on
this_.id=this_1_.company_id
left outer join company_measures companymea2_ on
this_.id=companymea2_.company_id and companymea2_.company_id5000
left outer join company_descr ces3_ on this_.id=ces3_.company_id and
ces3_.company_id5000
where this_1_.tag_id = 7 and this_.id5000
and this_1_.company_id5000
order by this_.id asc limit 1000;

(plan3.txt),
Total runtime: 26.327 ms
BTW: Changing slow query to inner joins do not make it fast



 Another thing is it seems that the number of rows guessed is far off from
 the actual number of rows, is the number 500 artificial or are you're
 statistics old or too small histogram/mcv's?


Nope, I suppose this is because of limit. If I remove the limit, the
estimations are quite correct. There are ~6 millions of row in each table.
Limit  (cost=293.40..1028.60 rows=1000 width=1209) (actual time=0.128..24.751 
rows=1000 loops=1)
  -  Merge Left Join  (cost=293.40..4197731.11 rows=5709243 width=1209) 
(actual time=0.124..21.968 rows=1000 loops=1)
Merge Cond: (this_.id = (companymea2_.company_id)::bigint)
-  Merge Left Join  (cost=246.59..3681230.10 rows=5709243 width=1141) 
(actual time=0.099..15.284 rows=1000 loops=1)
  Merge Cond: (this_.id = (ces3_.company_id)::bigint)
  -  Merge Join  (cost=37.87..2435536.00 rows=5709243 width=639) 
(actual time=0.074..8.487 rows=1000 loops=1)
Merge Cond: (this_.id = (this_1_.company_id)::bigint)
-  Index Scan using pk_comp_m on company this_  
(cost=0.00..2139324.53 rows=6886598 width=623) (actual time=0.037..1.860 
rows=1054 loops=1)
  Index Cond: (id  5000)
-  Index Scan using company_tag_company_id_idx on 
company_tag this_1_  (cost=0.00..207678.85 rows=5709243 width=16) (actual 
time=0.027..1.758 rows=1000 loops=1)
  Index Cond: ((this_1_.company_id)::bigint  5000)
  Filter: ((this_1_.tag_id)::bigint = 7)
  -  Index Scan using cws_company_descr_unique on company_descr 
ces3_  (cost=0.00..1169511.92 rows=5757068 width=502) (actual time=0.020..1.788 
rows=1054 loops=1)
Index Cond: ((ces3_.company_id)::bigint  5000)
-  Index Scan using pk_comp_ms on company_measures companymea2_  
(cost=0.00..440945.79 rows=565 width=68) (actual time=0.019..1.729 
rows=1054 loops=1)
  Index Cond: ((companymea2_.company_id)::bigint  5000)
Total runtime: 26.327 ms


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Віталій Тимчишин
16 квітня 2010 р. 11:25 Hannu Krosing ha...@2ndquadrant.com написав:

 On Fri, 2010-04-16 at 11:02 +0300, Віталій Тимчишин wrote:
  Hello.
 
 
  I have a query that performs very poor because there is a limit on
  join column that is not applied to other columns:
 
 
  select * from company this_ left outer join company_tag this_1_ on
  this_.id=this_1_.company_id left outer join company_measures
  companymea2_ on this_.id=companymea2_.company_id left outer join
  company_descr ces3_ on this_.id=ces3_.company_id where this_1_.tag_id
  = 7 and this_.id5000
  and this_1_.company_id5000
  order by this_.id asc limit 1000;
 
 
  (plan1.txt)
  Total runtime: 7794.692 ms
 
 
  At the same time if I apply the limit (5000) to other columns in
  query itself it works like a charm:
 
 
  select * from company this_ left outer join company_tag this_1_ on
  this_.id=this_1_.company_id left outer join company_measures
  companymea2_ on this_.id=companymea2_.company_id left outer join
  company_descr ces3_ on this_.id=ces3_.company_id where this_1_.tag_id
  = 7 and this_.id5000
  and this_1_.company_id5000
  and companymea2_.company_id5000 and ces3_.company_id5000
  order by this_.id asc limit 1000;

 The queries are not the same.

 2nd variant will not return the rows where there are no matching rows
 inthis_1_ , companymea2_ or ces3_.company_id

 A query equivalent to first one would be:


 select * from company this_
  left outer join company_tag this_1_
   on (this_.id=this_1_.company_id
and this_1_.company_id5000)
   left outer join company_measures companymea2_
   on (this_.id=companymea2_.company_id
and companymea2_.company_id5000)
   left outer join company_descr ces3_
   on (this_.id=ces3_.company_id
and ces3_.company_id5000)
  where this_1_.tag_id = 7
   and this_.id5000
  order by this_.id asc
  limit 1000;


And it's still fast (see plan in another mail), while inner join variant
of original query is still slow.




 I'm not sure that planner considers the above form of plan rewrite, nor
 that it would make much sense to do so unless there was a really small
 number of rows where x_.company_id5000

 Actually no,
select id  5000, count(*) from company group by 1
f,1096042
t,5725630

I don't know why the planner wishes to perform few merges of 1000 to a
million of records (and the merges is the thing that takes time) instead of
taking a 1000 of records from main table and then doing a nested loop. And
it must read all the records that DO NOT match the criteria for secondary
tables before getting to correct records if it do not filter secondary
tables with index on retrieve.

set enable_mergejoin=false helps original query, but this is another problem
and first solution is simpler and can be used by planner automatically,
while second requires rethinking/rewrite of LIMIT estimation logic
(Plan of nested loop attached)
Limit  (cost=0.00..2369.36 rows=1000 width=1209) (actual time=0.179..41.155 
rows=1000 loops=1)
  -  Nested Loop  (cost=0.00..15727940.41 rows=6638046 width=1209) (actual 
time=0.174..38.312 rows=1000 loops=1)
-  Nested Loop  (cost=0.00..11165483.75 rows=6701224 width=1141) 
(actual time=0.134..26.421 rows=1000 loops=1)
  -  Nested Loop  (cost=0.00..5763844.03 rows=6765004 width=639) 
(actual time=0.066..14.389 rows=1000 loops=1)
-  Index Scan using pk_comp_m on company this_  
(cost=0.00..1152936.77 rows=6886598 width=623) (actual time=0.038..1.908 
rows=1054 loops=1)
  Index Cond: (id  5000)
-  Index Scan using company_tag_company_id_idx on 
company_tag this_1_  (cost=0.00..0.66 rows=1 width=16) (actual 
time=0.004..0.006 rows=1 loops=1054)
  Index Cond: ((this_1_.company_id)::bigint = this_.id)
  Filter: ((this_1_.tag_id)::bigint = 7)
  -  Index Scan using cws_company_descr_unique on company_descr 
ces3_  (cost=0.00..0.79 rows=1 width=502) (actual time=0.004..0.006 rows=1 
loops=1000)
Index Cond: ((ces3_.company_id)::bigint = this_.id)
-  Index Scan using pk_comp_ms on company_measures companymea2_  
(cost=0.00..0.67 rows=1 width=68) (actual time=0.004..0.006 rows=1 loops=1000)
  Index Cond: ((companymea2_.company_id)::bigint = this_.id)
Total runtime: 42.940 ms


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Yeb Havinga

Віталій Тимчишин wrote:


BTW: Changing slow query to inner joins do not make it fast

I'm interested to see the query andplan of the slow query with inner joins.




Another thing is it seems that the number of rows guessed is far
off from the actual number of rows, is the number 500
artificial or are you're statistics old or too small histogram/mcv's?


Nope, I suppose this is because of limit. If I remove the limit, the 
estimations are quite correct. There are ~6 millions of row in each table.

Yes, that makes sense.

regards,
Yeb Havinga


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Віталій Тимчишин
16 квітня 2010 р. 16:21 Yeb Havinga yebhavi...@gmail.com написав:

 Віталій Тимчишин wrote:


 BTW: Changing slow query to inner joins do not make it fast

 I'm interested to see the query andplan of the slow query with inner joins.


 Here you are. The query:

select * from company this_ inner join company_tag this_1_ on
this_.id=this_1_.company_id
inner join company_measures companymea2_ on
this_.id=companymea2_.company_id
inner join company_descr ces3_ on this_.id=ces3_.company_id
where this_1_.tag_id = 7 and this_.id5000
order by this_.id asc
limit 1000
;
Total runtime: 14088.942 ms
(plan is attached)

Best regards, Vitalii Tymchyshyn
Limit  (cost=227.15..883.22 rows=1000 width=1209) (actual 
time=14062.106..14087.375 rows=1000 loops=1)
  -  Merge Join  (cost=227.15..4355277.70 rows=6638046 width=1209) (actual 
time=14062.101..14084.577 rows=1000 loops=1)
Merge Cond: (this_.id = (this_1_.company_id)::bigint)
-  Merge Join  (cost=185.14..4025978.59 rows=6757358 width=1193) 
(actual time=10692.975..10708.923 rows=1054 loops=1)
  Merge Cond: ((companymea2_.company_id)::bigint = this_.id)
  -  Merge Join  (cost=0.00..1784574.44 rows=6821672 width=570) 
(actual time=0.111..9138.804 rows=1097096 loops=1)
Merge Cond: ((companymea2_.company_id)::bigint = 
(ces3_.company_id)::bigint)
-  Index Scan using pk_comp_ms on company_measures 
companymea2_  (cost=0.00..456350.36 rows=6821672 width=68) (actual 
time=0.066..1747.291 rows=1097096 loops=1)
-  Index Scan using cws_company_descr_unique on 
company_descr ces3_  (cost=0.00..1225899.00 rows=6821672 width=502) (actual 
time=0.033..1822.085 rows=1097096 loops=1)
  -  Index Scan using pk_comp_m on company this_  
(cost=0.00..2139324.53 rows=6886598 width=623) (actual time=0.026..1.853 
rows=1054 loops=1)
Index Cond: (this_.id  5000)
-  Index Scan using company_tag_company_id_idx on company_tag this_1_  
(cost=0.00..229167.56 rows=6765004 width=16) (actual time=0.028..1859.967 
rows=1075634 loops=1)
  Filter: ((this_1_.tag_id)::bigint = 7)
Total runtime: 14088.942 ms


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Tom Lane
=?KOI8-U?B?96bUwcymyiD0yc3eydvJzg==?= tiv...@gmail.com writes:
 I've thought and someone in this list've told me that this should be done
 automatically.

No, that's not true.  We do make deductions about transitive equalities,
ie, given WHERE a=b AND b=c the planner will infer a=c and use that if
it's helpful.  We don't make deductions about inequalities such as ac.
In theory there's enough information available to do so, but overall
trying to do that would probably waste more cycles than it would save.
You'd need a lot of expensive new planner infrastructure, and in the
vast majority of queries it wouldn't produce anything very helpful.

As was pointed out, even if we had such logic it wouldn't apply in this
example, because the equality conditions aren't real equalities but
OUTER JOIN conditions.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Yeb Havinga

Tom Lane wrote:

=?KOI8-U?B?96bUwcymyiD0yc3eydvJzg==?= tiv...@gmail.com writes:
  

I've thought and someone in this list've told me that this should be done
automatically.



No, that's not true.  We do make deductions about transitive equalities,
ie, given WHERE a=b AND b=c the planner will infer a=c and use that if
it's helpful.  We don't make deductions about inequalities such as ac.
In theory there's enough information available to do so, but overall
trying to do that would probably waste more cycles than it would save.
You'd need a lot of expensive new planner infrastructure, and in the
vast majority of queries it wouldn't produce anything very helpful.
  
New expensive planner infrastructure to support from ab and bc infer 
ac, yes.


But I wonder if something like Leibniz's principle of identity holds for 
members of the same equivalence class, e.g. like if x,y are both members 
of the same EC, then for every predicate P, P(x) iff P(y). Probably not 
for every predicate (like varno = 2 or attname='x'), but for the query 
evaluation, the object denoted by the variables are the same, since that 
is the standard meaning of the = operator. I cannot think of any 
standard (btree) operator where 'Leibniz' would fail in this case.


regards,
Yeb Havinga



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Planner not using column limit specified for one column for another column equal to first

2010-04-16 Thread Tom Lane
Yeb Havinga yebhavi...@gmail.com writes:
 New expensive planner infrastructure to support from ab and bc infer 
 ac, yes.

 But I wonder if something like Leibniz's principle of identity holds for 
 members of the same equivalence class, e.g. like if x,y are both members 
 of the same EC, then for every predicate P, P(x) iff P(y).

This could only be assumed to apply for predicates constructed from
operators that are in the equivalence operator's btree opfamily.
Now, that would certainly be a large enough set of cases to sometimes
give useful results --- but I stand by the opinion that it wouldn't
win often enough to justify the added planner overhead.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance