Re: [HACKERS] Execute query with EXCEPT, INTERSECT as anti-join, join?

2013-11-11 Thread Tom Lane
Colin 't Hart colinth...@gmail.com writes:
 I can't get Postgresql to execute a query with EXCEPT (or INTERSECT)
 as an anti-join (or join).

 Is this even possible?

No, and it probably won't ever be, since the semantics aren't the same.
EXCEPT/INTERSECT imply duplicate elimination.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Execute query with EXCEPT, INTERSECT as anti-join, join?

2013-11-11 Thread Colin 't Hart
On 11 November 2013 14:34, Tom Lane t...@sss.pgh.pa.us wrote:
 Colin 't Hart colinth...@gmail.com writes:
 I can't get Postgresql to execute a query with EXCEPT (or INTERSECT)
 as an anti-join (or join).

 Is this even possible?

 No, and it probably won't ever be, since the semantics aren't the same.
 EXCEPT/INTERSECT imply duplicate elimination.

Can't we just use DISTINCT for that?

Given a query

query_1 EXCEPT query_2

isn't it always possible to rewrite this as

select distinct * from (query_1) q1 where not exists (select 1 from
(query_2) q2 where q1.col1 = q2.col1 and q1.col2 = c2.col2 and ...
and q1.colN = q2.colN)

?


Regards,

Colin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Execute query with EXCEPT, INTERSECT as anti-join, join?

2013-11-11 Thread Tom Lane
Colin 't Hart co...@sharpheart.org writes:
 On 11 November 2013 14:34, Tom Lane t...@sss.pgh.pa.us wrote:
 No, and it probably won't ever be, since the semantics aren't the same.
 EXCEPT/INTERSECT imply duplicate elimination.

 Can't we just use DISTINCT for that?

If you have to do a DISTINCT it's not clear to me that you're going to get
much win.

(The bigger picture here is that pretty much zero optimization effort has
been spent on EXCEPT/INTERSECT, because they're just not used that much
compared to other places where we could put that effort.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Execute query with EXCEPT, INTERSECT as anti-join, join?

2013-11-11 Thread Colin 't Hart
On 11 November 2013 15:16, Tom Lane t...@sss.pgh.pa.us wrote:
 Colin 't Hart co...@sharpheart.org writes:
 On 11 November 2013 14:34, Tom Lane t...@sss.pgh.pa.us wrote:
 No, and it probably won't ever be, since the semantics aren't the same.
 EXCEPT/INTERSECT imply duplicate elimination.

 Can't we just use DISTINCT for that?

 If you have to do a DISTINCT it's not clear to me that you're going to get
 much win.

 (The bigger picture here is that pretty much zero optimization effort has
 been spent on EXCEPT/INTERSECT, because they're just not used that much
 compared to other places where we could put that effort.)

I'm asking because I just encountered several cases where the
anti-join was *much* faster. In each case query_1's result was
relatively small compared to query_2's result or the related rows
from query_2 were a much smaller set than the whole result of
query_2. In these cases, when the executor new how the two halves of
the query were related -- and that's the crux here: by writing EXCEPT
the executor couldn't determine how the two halves of the query were
related -- the anti-join was about 1000 times faster.

I think it's similar to the NOT IN which most DBMSes solved about 20
years ago but before that everyone used to rewrite by hand as NOT
EXISTS: sometimes we want to write query as EXCEPT because it's
clearer but execute it as an anti-join with DISTINCT.

Would these be difficult to build in? While I know a lot about how
DBMS engines work I've not hacked at PG internals. I'd be more than
willing to look at it, but could use some pointers as to where to
start. In particular, does PG rewrite queries in any way? Is it
possible to simply rewrite the query and then pass to the optimizer
to see if it would result in a better plan?

if I can improve EXCEPT, the same could also be applied to INTERSECT
as a DISTINCT join.

Thanks  regards,

Colin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Execute query with EXCEPT, INTERSECT as anti-join, join?

2013-11-11 Thread Tom Lane
Colin 't Hart colinth...@gmail.com writes:
 Would these be difficult to build in?

Well, you'd have to worry about the ALL cases, as well as how to determine
whether you're actually getting a win (which would probably be rather
tough, really, as the choice would have to be made before we've fired up
any of the planner machinery that supports statistical estimation :-().
The code that plans this is in src/backend/optimizer/prep/prepunion.c.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers