Robert Haas <robertmh...@gmail.com> writes: > On Sat, Jan 6, 2024 at 4:08 PM Tom Lane <t...@sss.pgh.pa.us> wrote: >> The argument for the patch as proposed is that we should make the >> mergejoin and hashjoin code paths do what the nestloop path is doing. >> However, as I replied further down in that other thread, I'm not >> exactly convinced that the nestloop path is doing the right thing. >> I've not had time to look closer at that, though.
> I don't really understand what you were saying in your response there, > or what you're saying here. It seems to me that if the path is > parameterized by top relids, and the assertion is verifying that a > certain set of non-toprelids i.e. otherrels isn't present, then > obviously the assertion is going to pass, but it's not checking what > it purports to be checking. I think we're talking at cross-purposes. What I was wondering about (at least further down in the thread) was whether we shouldn't be checking *both* the "real" and the "parent" relids to make sure they don't overlap the parameterization sets. But it's probably overkill. After reading the code again I agree that the parent relids are more useful to check. However, I still don't like Richard's patch too much as-is, because the Asserts are difficult to read/understand and even more difficult to compare to the other code path. I think we want to keep the nestloop and not-nestloop paths as visually similar as possible, so I propose we do it more like the attached (where I also borrowed some of your wording for the comments). A variant of this could be to ifdef out all the added code in non-Assert builds: Relids outer_paramrels = PATH_REQ_OUTER(outer_path); Relids inner_paramrels = PATH_REQ_OUTER(inner_path); Relids required_outer; #ifdef USE_ASSERT_CHECKING Relids innerrelids; Relids outerrelids; /* * Any parameterization of the input paths refers to topmost parents of * the relevant relations, because reparameterize_path_by_child() hasn't * been called yet. So we must consider topmost parents of the relations * being joined, too, while checking for disallowed parameterization * cases. */ if (inner_path->parent->top_parent_relids) innerrelids = inner_path->parent->top_parent_relids; else innerrelids = inner_path->parent->relids; if (outer_path->parent->top_parent_relids) outerrelids = outer_path->parent->top_parent_relids; else outerrelids = outer_path->parent->relids; /* neither path can require rels from the other */ Assert(!bms_overlap(outer_paramrels, innerrelids)); Assert(!bms_overlap(inner_paramrels, outerrelids)); #endif /* form the union ... */ but I'm not sure that's better. Probably any reasonable compiler will throw away the whole calculation upon noticing the outputs are unused. regards, tom lane
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c index 3fd5a24fad..e9def9d540 100644 --- a/src/backend/optimizer/path/joinpath.c +++ b/src/backend/optimizer/path/joinpath.c @@ -730,8 +730,11 @@ try_nestloop_path(PlannerInfo *root, return; /* - * Paths are parameterized by top-level parents, so run parameterization - * tests on the parent relids. + * Any parameterization of the input paths refers to topmost parents of + * the relevant relations, because reparameterize_path_by_child() hasn't + * been called yet. So we must consider topmost parents of the relations + * being joined, too, while determining parameterization of the result and + * checking for disallowed parameterization cases. */ if (innerrel->top_parent_relids) innerrelids = innerrel->top_parent_relids; diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c index f7ac71087a..8dbf790e89 100644 --- a/src/backend/optimizer/util/pathnode.c +++ b/src/backend/optimizer/util/pathnode.c @@ -2360,6 +2360,9 @@ create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel, * calc_nestloop_required_outer * Compute the required_outer set for a nestloop join path * + * Note: when considering a child join, the inputs nonetheless use top-level + * parent relids + * * Note: result must not share storage with either input */ Relids @@ -2394,11 +2397,30 @@ calc_non_nestloop_required_outer(Path *outer_path, Path *inner_path) { Relids outer_paramrels = PATH_REQ_OUTER(outer_path); Relids inner_paramrels = PATH_REQ_OUTER(inner_path); + Relids innerrelids PG_USED_FOR_ASSERTS_ONLY; + Relids outerrelids PG_USED_FOR_ASSERTS_ONLY; Relids required_outer; + /* + * Any parameterization of the input paths refers to topmost parents of + * the relevant relations, because reparameterize_path_by_child() hasn't + * been called yet. So we must consider topmost parents of the relations + * being joined, too, while checking for disallowed parameterization + * cases. + */ + if (inner_path->parent->top_parent_relids) + innerrelids = inner_path->parent->top_parent_relids; + else + innerrelids = inner_path->parent->relids; + + if (outer_path->parent->top_parent_relids) + outerrelids = outer_path->parent->top_parent_relids; + else + outerrelids = outer_path->parent->relids; + /* neither path can require rels from the other */ - Assert(!bms_overlap(outer_paramrels, inner_path->parent->relids)); - Assert(!bms_overlap(inner_paramrels, outer_path->parent->relids)); + Assert(!bms_overlap(outer_paramrels, innerrelids)); + Assert(!bms_overlap(inner_paramrels, outerrelids)); /* form the union ... */ required_outer = bms_union(outer_paramrels, inner_paramrels); /* we do not need an explicit test for empty; bms_union gets it right */