Thank for looking at this again. On 25 January 2017 at 06:27, Tom Lane <t...@sss.pgh.pa.us> wrote: > David Rowley <david.row...@2ndquadrant.com> writes: >> However, having said that, I'm not sure why we'd need outer_unique >> available so we'd know that we could skip mark/restore. I think >> inner_unique is enough for this purpose. Take the comment from >> nodeMergejoin.c: > >> * outer: (0 ^1 1 2 5 5 5 6 6 7) current tuple: 1 >> * inner: (1 ^3 5 5 5 5 6) current tuple: 3 >> ... >> * >> * Consider the above relations and suppose that the executor has >> * just joined the first outer "5" with the last inner "5". The >> * next step is of course to join the second outer "5" with all >> * the inner "5's". This requires repositioning the inner "cursor" >> * to point at the first inner "5". This is done by "marking" the >> * first inner 5 so we can restore the "cursor" to it before joining >> * with the second outer 5. The access method interface provides >> * routines to mark and restore to a tuple. > >> If only one inner "5" can exist (unique_inner), then isn't the inner >> side already in the correct place, and no restore is required? > > Hmm ... let me see whether I have my head wrapped around this correctly. > > What I had in mind is a different optimization. When the inner is not > unique, we can't know we're done joining the first outer "5" until we > reach the inner "6". At that point, what we normally do is advance the > outer by one row and back up the inner to the mark point (the first inner > "5"). But the only reason to back up is that the new outer row might join > to the same inner rows the previous one did. If we know the outer side is > unique, then those inner rows can't join to the new outer row so no need > to back up. So this requires no real change to the mergejoin algorithm, > we just skip the mark and restore steps.
I'd not thought of that possibility, although with a unique outer, I don't think it'll ever save a restore, as the comparison to the new outer tuple will always fail to match the marked inner tuple. Never-the-less we can save that useless comparison, so I've written code to that affect in the attached. Perhaps detecting unique outer is not worthwhile for just this? > > I think what you're saying is that, if we know the inner side is unique, > we can also skip mark/restore overhead; but it would work a bit > differently. After joining two rows with equal keys, instead of advancing > the inner as per standard algorithm, we'd need to advance the outer side. > (This is OK because we know the next inner can't join to this same outer. Yeah, this is the same optimisation as applies to the other join types too which happens to just be the same point as semi joins must give up too. > But we don't know if the next outer can join to this inner.) We advance > inner only when current inner < current outer, so we're done with that > inner and need never rewind. So this is a more fundamental algorithm > change but it gets the same performance benefit. Yes, I think if inner is unique and outer is unique then after joining in EXEC_MJ_JOINTUPLES, we can have a new state EXEC_MJ_NEXTINNERANDOUTER, as the next outer won't match this inner, so we might as well skip to the next on both, but quite possibly such a join is just not common enough to have to worry about that. It might not give us much better performance anyway. > So the question is, if we can skip mark/restore overhead when we know that > either input is unique, is it necessary for the planner to account for > both ways explicitly? Given the general symmetry of mergejoins, it might > be okay for the planner to preferentially generate plans with the unique > input on the inside, and not worry about optimizing in this way when it's > on the outside. I'm inclined to agree. > Now, JOIN_SEMI and JOIN_ANTI cases are *not* symmetric, since we don't > implement reverse-semi or reverse-anti join modes. But I don't know that > there's anything to be won by noticing that the outer side is unique in > those cases. I think probably we can apply the same technique of > advancing outer not inner, and never rewinding, in those join modes > whether or not either input is known unique. > > Another asymmetry is that if one input is likely to be empty, it's better > to put that one on the outside, because if it is empty we don't have to > fetch anything from the other input (thus saving its startup cost). But > the planner doesn't account for that anyway because it never believes > non-dummy relations are truly empty; so even if it's getting this right > today, it's purely by chance. > > I can't think of any other asymmetries offhand. > > In short, I think you are right that it might be enough to account for > inner uniqueness only, and not worry about it for the outer side, even > for mergejoin. This means my previous objection is wrong and we don't > really have to allow for a future extension of that kind while choosing > the notation the planner uses. > > So ... would you rather go back to the previous notation (extra > JoinTypes), or do you like the separate boolean better anyway? I didn't really ever care for adding the new join types. I think if someone writes SELECT ... FROM ... INNER JOIN ... and sees a "Semi Join" in the plan, then we'll be bombarded with bug reports. Also if we ever changed our minds about outer unique, then we'd not be in a very great position to go and add it again. > Sorry for jerking you back and forth like this, but sometimes the > correct path isn't apparent from the start. It's fine. I learned quite a bit today thinking about all this again. The attached has my Merge Join changes, to show what I think can be done to make use of unique outer. Let me know what you think, but I get that idea that we're both leaning towards ripping the outer unique stuff out, so I'll go do that now... -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Description: Binary data
-- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers