Re: [HACKERS] WIP: patch to create explicit support for semi and anti joins
Tom Lane [EMAIL PROTECTED] wrote: Introduce JOIN_SEMI and JOIN_ANTI join types, the former replacing JOIN_IN. Unify the InClauseInfo and OuterJoinInfo infrastructure into SpecialJoinInfo. Convert IN, EXISTS, and NOT EXISTS clauses at top level of WHERE into semi and anti joins respectively. It just struck me that this may cause additional joins to count against the join_collapse_limit. If so, that could cause some borderline queries to optimize poorly if the limit isn't raised. Is that a reasonable concern? Possibly something to document in the release notes? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: patch to create explicit support for semi and anti joins
Kevin Grittner [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] wrote: Introduce JOIN_SEMI and JOIN_ANTI join types, the former replacing JOIN_IN. It just struck me that this may cause additional joins to count against the join_collapse_limit. If so, that could cause some borderline queries to optimize poorly if the limit isn't raised. Is that a reasonable concern? Possibly something to document in the release notes? I don't think we should change anything there. The collapse limit settings are mainly driven by not wanting to get into an exponential growth of planning time, and the way in which join situations are created doesn't really affect the appropriate value one way or the other. In a case where this did happen, you'd just have exchanged one suboptimal planning situation for another, so I'm unconvinced that it'd make matters any worse compared to prior releases. (That does seem like a point worth testing, though, if you want to.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: patch to create explicit support for semi and anti joins
On Wed, 2008-08-13 at 23:12 -0400, Tom Lane wrote: We're just trying to provide better performance for certain common SQL idioms. Sounds good, but can you explain how this will help? Not questioning it, just after more information about it. I'm half way through join removal patch, so this work might extend the join elimination to semi/anti joins also (hopefully), or it might (hopefully not) prevent the join elimination altogether. I'll let you know how I get on. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: patch to create explicit support for semi and anti joins
Simon Riggs [EMAIL PROTECTED] writes: On Wed, 2008-08-13 at 23:12 -0400, Tom Lane wrote: We're just trying to provide better performance for certain common SQL idioms. Sounds good, but can you explain how this will help? 1. Allowing optimization of EXISTS/NOT EXISTS as general-purpose joins. Up to now, the best plan you could hope for was the equivalent of a nestloop with inner indexscan, with the EXISTS subquery on the inside. While that's not necessarily bad, it could be pretty bad for a large outer table. Now we'll have the option to consider merge and hash joins too. 2. Allowing the planner to get better estimates of the result size of these special join types. In the past we've had to kluge that, and the results weren't always good. Part of what I'm doing (the unfinished part ;-)) is to make more information about join context available to selectivity estimation functions, which is something we've known we needed for awhile. I can't yet *prove* that I can get better estimates with the added info, but if not, that just means I need to rethink what to pass down exactly. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: patch to create explicit support for semi and anti joins
On Thu, 2008-08-14 at 10:04 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: On Wed, 2008-08-13 at 23:12 -0400, Tom Lane wrote: We're just trying to provide better performance for certain common SQL idioms. Sounds good, but can you explain how this will help? 1. Allowing optimization of EXISTS/NOT EXISTS as general-purpose joins. Up to now, the best plan you could hope for was the equivalent of a nestloop with inner indexscan, with the EXISTS subquery on the inside. While that's not necessarily bad, it could be pretty bad for a large outer table. Now we'll have the option to consider merge and hash joins too. 2. Allowing the planner to get better estimates of the result size of these special join types. In the past we've had to kluge that, and the results weren't always good. Part of what I'm doing (the unfinished part ;-)) is to make more information about join context available to selectivity estimation functions, which is something we've known we needed for awhile. I can't yet *prove* that I can get better estimates with the added info, but if not, that just means I need to rethink what to pass down exactly. OK, that sounds good. Are you also working on transforming NOT IN into different form? Or is that the same thing as (1)? -- Simon Riggs www.2ndQuadrant.com PostgreSQL Training, Services and Support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: patch to create explicit support for semi and anti joins
Simon Riggs [EMAIL PROTECTED] writes: OK, that sounds good. Are you also working on transforming NOT IN into different form? Or is that the same thing as (1)? I'm not currently thinking about NOT IN. It could be transformed to an antijoin if we could prove that no nulls are involved, but that seems less than trivial as I noted earlier. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: patch to create explicit support for semi and anti joins
Tom Lane [EMAIL PROTECTED] wrote: I can't yet *prove* that I can get better estimates with the added info, but if not, that just means I need to rethink what to pass down exactly. I'll see if I can do some testing here to confirm plan improvements and check estimate accuracy. This is only on the trunk, not any stable branches? I assume that effort should wait until you have a candidate for the estimate improvements? -Kevin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: patch to create explicit support for semi and anti joins
Kevin Grittner [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] wrote: I can't yet *prove* that I can get better estimates with the added info, but if not, that just means I need to rethink what to pass down exactly. I'll see if I can do some testing here to confirm plan improvements and check estimate accuracy. This is only on the trunk, not any stable branches? I assume that effort should wait until you have a candidate for the estimate improvements? Yeah. If you feel like it you can test what I just committed, but it's definitely not where I expect to end up in another few days. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: patch to create explicit support for semi and anti joins
On Aug 13, 2008, at 17:31, Tom Lane wrote: What's done: Introduce JOIN_SEMI and JOIN_ANTI join types, the former replacing JOIN_IN. Unify the InClauseInfo and OuterJoinInfo infrastructure into SpecialJoinInfo. Convert IN, EXISTS, and NOT EXISTS clauses at top level of WHERE into semi and anti joins respectively. Recognize LEFT JOIN with a suitable IS NULL filter condition as an anti join. This all compiles and passes the regression tests. Wow. That sound awesome, Tom. Stupid question: Do these join types have some sort of correspondence to the SQL standard? Or would they be specific to PostgreSQL? Or is this just something that's under the hood an not actually a change to the syntax of SQL joins? What's not done: nodeMergejoin.c doesn't yet handle JOIN_ANTI. (This is just a SMOP, but it's a lot more complicated than the nestloop or hash logic, and I figured nestloop and hash were enough for testing the planner.) I guess that means you plan to do it once there has been significant testing with nestloop and hash and when the selectivity stuff is done? Best, David (Who is in over his head, but decides to stick his toe in the water anyway.) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: patch to create explicit support for semi and anti joins
David E. Wheeler [EMAIL PROTECTED] writes: On Aug 13, 2008, at 17:31, Tom Lane wrote: Introduce JOIN_SEMI and JOIN_ANTI join types, Wow. That sound awesome, Tom. Stupid question: Do these join types have some sort of correspondence to the SQL standard? Semi and anti joins are pretty standard concepts in relational theory, but they have no direct mapping in the SQL join syntax. You can write them with certain well-known locutions, though: IN and EXISTS, with certain restrictions, represent semi join NOT EXISTS, with certain restrictions, represents anti join LEFT JOIN with an incompatible higher IS NULL test represents anti join Basically what this patch is about is teaching the planner that these constructs are best understood via the relational-theory concepts. We'd been doing it in a pretty ad-hoc way before, and run into a lot of problems that we've had to kluge around. I think that this approach provides a structure that will actually work well. Or is this just something that's under the hood an not actually a change to the syntax of SQL joins? Right, there's no user visible feature or syntax change here. We're just trying to provide better performance for certain common SQL idioms. What's not done: nodeMergejoin.c doesn't yet handle JOIN_ANTI. (This is just a SMOP, I guess that means you plan to do it once there has been significant testing with nestloop and hash and when the selectivity stuff is done? Actually, I got it done an hour or so ago --- it turned out to be easier than I thought. It just didn't seem like part of the critical path for the patch, so I'd been willing to let it go till later. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: patch to create explicit support for semi and anti joins
On Aug 13, 2008, at 20:12, Tom Lane wrote: Wow. That sound awesome, Tom. Stupid question: Do these join types have some sort of correspondence to the SQL standard? Semi and anti joins are pretty standard concepts in relational theory, but they have no direct mapping in the SQL join syntax. You can write them with certain well-known locutions, though: IN and EXISTS, with certain restrictions, represent semi join NOT EXISTS, with certain restrictions, represents anti join LEFT JOIN with an incompatible higher IS NULL test represents anti join Basically what this patch is about is teaching the planner that these constructs are best understood via the relational-theory concepts. We'd been doing it in a pretty ad-hoc way before, and run into a lot of problems that we've had to kluge around. I think that this approach provides a structure that will actually work well. Great. Thanks for the explanation, Tom, as always. Or is this just something that's under the hood an not actually a change to the syntax of SQL joins? Right, there's no user visible feature or syntax change here. We're just trying to provide better performance for certain common SQL idioms. Good, it makes a lot of sense. What's not done: nodeMergejoin.c doesn't yet handle JOIN_ANTI. (This is just a SMOP, I guess that means you plan to do it once there has been significant testing with nestloop and hash and when the selectivity stuff is done? Actually, I got it done an hour or so ago --- it turned out to be easier than I thought. It just didn't seem like part of the critical path for the patch, so I'd been willing to let it go till later. I love it when things work that way. :-) Best, David -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers