Re: [HACKERS] Improving planning of outer joins

2005-12-21 Thread Christopher Kings-Lynne

I'm not sure whether we'd need any additional planner knobs to control
this.  I think that the existing join_collapse_limit GUC variable should
continue to exist, but its effect on left/right joins will be the same as
for inner joins.  If anyone wants to force join order for outer joins more
than for inner joins, we'd need some other control setting, but I don't
currently see why that would be very useful.

Does this seem like a reasonable agenda, or am I thinking too small?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Improving planning of outer joins

2005-12-16 Thread Simon Riggs
On Wed, 2005-12-14 at 21:27 -0500, Tom Lane wrote:
 I've spent some time looking into how we can improve our planning of outer
 joins.  

Sounds good.

 I'm not sure whether we'd need any additional planner knobs to control
 this.  I think that the existing join_collapse_limit GUC variable should
 continue to exist, but its effect on left/right joins will be the same as
 for inner joins.  If anyone wants to force join order for outer joins more
 than for inner joins, we'd need some other control setting, but I don't
 currently see why that would be very useful.

Agreed. No additional GUCs required.

 Does this seem like a reasonable agenda

Yup - tis good.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Improving planning of outer joins

2005-12-16 Thread Simon Riggs
On Thu, 2005-12-15 at 09:25 -0500, Tom Lane wrote:

 I did find some papers that talked about ways to push joins up and down
 past aggregations and GROUP BY, but that's a problem for another day.

Yes, they seem like a good thing to get round to... the papers seem to
present them as fairly clear transformations. I'd be interested if you
get the chance to look at that.

Best Regards, Simon Riggs




---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Improving planning of outer joins

2005-12-16 Thread Alvaro Herrera
Tom Lane wrote:
 I've spent some time looking into how we can improve our planning of outer
 joins.  The current planner code slavishly follows the syntactic join
 order, which can lead to quite bad plans.  The reason it does this is that
 in some cases altering the join order of outer joins can change the
 results.  However, there are many cases where the results would not be
 changed, and we really need to start taking advantage of those cases.

I wonder if the code is already able to transform right joins to left
joins, like
(A rightjoin B on (Pab)) = (B leftjoin A on (Pab))

I haven't looked at the code but I vaguely remember it is possible with
some strings attached, like not being able to use not-mergejoinable
conditions or something.  I imagine it shows up as a leftjoin node with
some flag set.

How does this affect this optimization?  Does this hold:

(A rightjoin B on (Pab)) innerjoin C on (Pbc)
 = (B leftjoin A on (Pab)) innerjoin C on (Pbc)
 = (B innerjoin C on (Pbc)) leftjoin A on (Pab)

?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Improving planning of outer joins

2005-12-16 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 I wonder if the code is already able to transform right joins to left
 joins, like
   (A rightjoin B on (Pab)) = (B leftjoin A on (Pab))

Yeah, we already know that part.  It's a freebie --- I didn't even
bother mentioning rightjoin in my post, since it's equivalent to
leftjoin after swapping the inputs.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Improving planning of outer joins

2005-12-16 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  I wonder if the code is already able to transform right joins to left
  joins, like
  (A rightjoin B on (Pab)) = (B leftjoin A on (Pab))
 
 Yeah, we already know that part.  It's a freebie --- I didn't even
 bother mentioning rightjoin in my post, since it's equivalent to
 leftjoin after swapping the inputs.

Why the thing about the mergejoinable conditions then?  Is that even
true?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Improving planning of outer joins

2005-12-16 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Why the thing about the mergejoinable conditions then?  Is that even
 true?

There are some things that work off mergejoinable conditions, but the
equivalence of right and left join isn't one of them ...

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Improving planning of outer joins

2005-12-15 Thread Greg Stark

Tom Lane [EMAIL PROTECTED] writes:

 There is some stuff in the literature about how to make transformations
 of the last kind, but it requires additional executor smarts to do strange
 sorts of generalized outer join operations.  

Would these generalized outer join operations be general enough to handle IN
semantics? Or other subqueries?


-- 
greg


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Improving planning of outer joins

2005-12-15 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes:
 There is some stuff in the literature about how to make transformations
 of the last kind, but it requires additional executor smarts to do strange
 sorts of generalized outer join operations.  

 Would these generalized outer join operations be general enough to handle IN
 semantics? Or other subqueries?

No, AFAICT it's just a weird way of defining a join operation.

I did find some papers that talked about ways to push joins up and down
past aggregations and GROUP BY, but that's a problem for another day.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Improving planning of outer joins

2005-12-15 Thread Jim C. Nasby
On Thu, Dec 15, 2005 at 09:25:42AM -0500, Tom Lane wrote:
 Greg Stark [EMAIL PROTECTED] writes:
  Tom Lane [EMAIL PROTECTED] writes:
  There is some stuff in the literature about how to make transformations
  of the last kind, but it requires additional executor smarts to do strange
  sorts of generalized outer join operations.  
 
  Would these generalized outer join operations be general enough to handle 
  IN
  semantics? Or other subqueries?
 
 No, AFAICT it's just a weird way of defining a join operation.
 
 I did find some papers that talked about ways to push joins up and down
 past aggregations and GROUP BY, but that's a problem for another day.

Might be worth adding a TODO for that and including links to the papers.
There's enough people that seem to drop in with PhD thesis and what-not
pulled from the TODO that someone could end up doing this work for us.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Improving planning of outer joins

2005-12-15 Thread Christopher Kings-Lynne

I'm not sure whether we'd need any additional planner knobs to control
this.  I think that the existing join_collapse_limit GUC variable should
continue to exist, but its effect on left/right joins will be the same as
for inner joins.  If anyone wants to force join order for outer joins more
than for inner joins, we'd need some other control setting, but I don't
currently see why that would be very useful.

Does this seem like a reasonable agenda, or am I thinking too small?


I think it's fantastic - it's a big complaint about planning we see 
often in the IRC channel.


Chris


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Improving planning of outer joins

2005-12-15 Thread Christopher Kings-Lynne

I'm not sure whether we'd need any additional planner knobs to control
this.  I think that the existing join_collapse_limit GUC variable should
continue to exist, but its effect on left/right joins will be the same as
for inner joins.  If anyone wants to force join order for outer joins more
than for inner joins, we'd need some other control setting, but I don't
currently see why that would be very useful.

Does this seem like a reasonable agenda, or am I thinking too small?


Oh, and if you wanted to go bigger - the next planning issue people seem 
to have is with our union planning...


Chris


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] Improving planning of outer joins

2005-12-14 Thread Tom Lane
I've spent some time looking into how we can improve our planning of outer
joins.  The current planner code slavishly follows the syntactic join
order, which can lead to quite bad plans.  The reason it does this is that
in some cases altering the join order of outer joins can change the
results.  However, there are many cases where the results would not be
changed, and we really need to start taking advantage of those cases.

After some review of the literature, I've found out that these identities
always hold:
(A leftjoin B on (Pab)) innerjoin C on (Pac)
= (A innerjoin C on (Pac)) leftjoin B on (Pab)
(where Pac is a predicate not referencing B, else of course the
transformation is nonsensical)
(A leftjoin B on (Pab)) leftjoin C on (Pac)
= (A leftjoin C on (Pac)) leftjoin B on (Pab)
Also, 
(A leftjoin B on (Pab)) leftjoin C on (Pbc)
= A leftjoin (B leftjoin C on (Pbc)) on (Pab)
if predicate Pbc must fail for all-null B rows.

The case that does *not* work is moving an innerjoin into or out of the
nullable side of an outer join:
A leftjoin (B join C on (Pbc)) on (Pab)
!= (A leftjoin B on (Pab)) join C on (Pbc)

There is some stuff in the literature about how to make transformations
of the last kind, but it requires additional executor smarts to do strange
sorts of generalized outer join operations.  I'm disinclined to tackle
that for now, partly because it seems like a lot of work for uncertain
payback (the first three cases cover most of the practical examples I've
looked at), and partly because I'm not too sure about the patent situation
for these things.  The three identities shown above have been known since
the late '80s and should be safe enough to work with.

What I'm thinking of doing is changing the planner so that a left or right
join node is broken apart into two separate relations (effectively
FROM-list items) that participate in the normal join-order search process,
subject to these limitations:
* A regular join within the nullable side (the right side of a left join,
  etc) remains indivisible and must be planned separately.  In other
  cases we recursively break down sub-joins into separate relations.
* The nullable side is marked to show that its first join must be to a set
  of relations including at least those relations on the outer side that
  are referenced in the outer-join condition.  (This compares to the
  current state of affairs, which is that its first join is always to
  exactly the relation(s) on the outer side.)  If the outer-join condition
  isn't strict for these outer-side relations, however, we have to punt
  and mark the nullable side as requiring all of the outer-side relations
  for its first join (this covers the case where the third identity above
  doesn't hold).

Full joins are not covered by this and will continue to be planned as now.

I'm not sure whether we'd need any additional planner knobs to control
this.  I think that the existing join_collapse_limit GUC variable should
continue to exist, but its effect on left/right joins will be the same as
for inner joins.  If anyone wants to force join order for outer joins more
than for inner joins, we'd need some other control setting, but I don't
currently see why that would be very useful.

Does this seem like a reasonable agenda, or am I thinking too small?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend