On 2/5/25 09:27, Richard Guo wrote: > On Wed, Feb 5, 2025 at 5:55 AM Tom Lane <t...@sss.pgh.pa.us> wrote: >> Right now, if we have four tables to join, we have a joinlist >> (A B C D). (Really they're integer relids, but let's use names here.) >> If we decide to force C to be joined last, it should be sufficient to >> convert this to ((A B D) C). If C and D both look like candidates for >> this treatment, we can make it be (((A B) C) D) or (((A B) D) C). >> This is pretty much the same thing that happens if you set >> join_collapse_limit to 1 and use JOIN syntax to force a join order. >> In fact, IIRC we start out with nested joinlists and there is some >> code that normally flattens them until it decides it'd be creating >> too large a sub-problem. I'm suggesting selectively reversing the >> flattening. > > This should not be too difficult to implement. Outer joins seem to > add some complexity, though. We need to ensure that the resulting > joins in each sub-list are legal given the query's join order > constraints. For example, if we make the joinlist be (((A B) C) D), > we need to ensure that the A/B join and the (A/B)/C join are legal. >
I've not done anything like this with joins before, but I AFAICS the interesting stuff happens in deconstruct_recurse(), especially close to the end where we check join_collapse_limit and do joinlist = list_make2(leftpart, rightpart); So I guess one way to "reverse the flattening" could be to do something in deconstruct_recourse(). But I don't think that'd work all that well, because of the recursion. We don't want to add a "pipeline break" into the join list, we want to move the "dimension" to the end - even if only within the group defined by join_collapse_limit. E.g. imagine we have a join of 8 relations, with F (fact), dimensions D1 and D2, and then some artibrary tables T1, T2, T3, T4, T5. And let's say deconstruct_recurse() sees them in this particular order [F, T1, T2, D1, D2, T3, T4, T5] AFAICS doing something in deconstruct_recurse() would likely split the joinlist into four parts [F, T1, T2] [D1] [D2] [T3, T4, T5] which does treat the D1,D2 as if join_collapse_limit=1, but it also splits the remaining relations into two groups, when we'd probably want something more like this: [F, T1, T2, T3, T4, T5] [D1] [D2] Which should be legal, because a requirement is that D1/D2 don't have any other join restrictions (I guess this could be relaxed a bit to only restrictions within that particular group). Which leads me to the conclusion that the best place to do this kind of stuff is deconstruct_jointree(), once we have the "complete" joinlist. We could walk it and reorder/split some of the joinlists again. regards -- Tomas Vondra