On Thu, Jul 2, 2026 at 2:18 AM Tomas Vondra <[email protected]> wrote:
> join hardness / DPccp > --------------------- > PoC patch series > ---------------- > > To show this is totally doable, I've implemented the attached WIP patch > series implementing this as a join_search_hook. Cool! > The primary goal is to allow calculating the hardness estimate, and > showing how it aligns with the "actual" hardness, i.e. the number of > make_join_rel calls. There is a (more) experimental part trying to then > split the join problems using the hardness, but that's very hacky (and > I'll get back to that in a bit). I just noticed join_ordering_upper_bound() is pretty brute-force. Going by the formula up top: > N! * C(N-1) = (2*N - 2)! / (N - 1)! ...isn't the right side just "for (k = n; k <= 2*n-2; k++) result *= k;" ? > If we set the budget to 10k, that'd cover all joins of 9 relations (so a > bit more than the current join_collapse_limit default. But it'd also > handle many of the larger joins - all the data points below the 10k > horizontal line. > > With a 100k budget, we'd handle all joins up to 11 relations, plus many > of the larger joins. If the budget is large enough, and if the large-join heuristic search is good enough, maybe we wouldn't need to consider splitting in the general case (i.e. no star patterns etc.)? 10k is already almost enough to handle a 12-way star join. -- John Naylor Amazon Web Services
