Robert Haas <[email protected]> writes:
> On Sat, Jan 6, 2024 at 4:08 PM Tom Lane <[email protected]> wrote:
>> The argument for the patch as proposed is that we should make the
>> mergejoin and hashjoin code paths do what the nestloop path is doing.
>> However, as I replied further down in that other thread, I'm not
>> exactly convinced that the nestloop path is doing the right thing.
>> I've not had time to look closer at that, though.
> I don't really understand what you were saying in your response there,
> or what you're saying here. It seems to me that if the path is
> parameterized by top relids, and the assertion is verifying that a
> certain set of non-toprelids i.e. otherrels isn't present, then
> obviously the assertion is going to pass, but it's not checking what
> it purports to be checking.
I think we're talking at cross-purposes. What I was wondering about
(at least further down in the thread) was whether we shouldn't be
checking *both* the "real" and the "parent" relids to make sure they
don't overlap the parameterization sets. But it's probably overkill.
After reading the code again I agree that the parent relids are more
useful to check.
However, I still don't like Richard's patch too much as-is, because
the Asserts are difficult to read/understand and even more difficult
to compare to the other code path. I think we want to keep the
nestloop and not-nestloop paths as visually similar as possible,
so I propose we do it more like the attached (where I also borrowed
some of your wording for the comments).
A variant of this could be to ifdef out all the added code in
non-Assert builds:
Relids outer_paramrels = PATH_REQ_OUTER(outer_path);
Relids inner_paramrels = PATH_REQ_OUTER(inner_path);
Relids required_outer;
#ifdef USE_ASSERT_CHECKING
Relids innerrelids;
Relids outerrelids;
/*
* Any parameterization of the input paths refers to topmost parents of
* the relevant relations, because reparameterize_path_by_child() hasn't
* been called yet. So we must consider topmost parents of the relations
* being joined, too, while checking for disallowed parameterization
* cases.
*/
if (inner_path->parent->top_parent_relids)
innerrelids = inner_path->parent->top_parent_relids;
else
innerrelids = inner_path->parent->relids;
if (outer_path->parent->top_parent_relids)
outerrelids = outer_path->parent->top_parent_relids;
else
outerrelids = outer_path->parent->relids;
/* neither path can require rels from the other */
Assert(!bms_overlap(outer_paramrels, innerrelids));
Assert(!bms_overlap(inner_paramrels, outerrelids));
#endif
/* form the union ... */
but I'm not sure that's better. Probably any reasonable compiler
will throw away the whole calculation upon noticing the outputs
are unused.
regards, tom lane
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 3fd5a24fad..e9def9d540 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -730,8 +730,11 @@ try_nestloop_path(PlannerInfo *root,
return;
/*
- * Paths are parameterized by top-level parents, so run parameterization
- * tests on the parent relids.
+ * Any parameterization of the input paths refers to topmost parents of
+ * the relevant relations, because reparameterize_path_by_child() hasn't
+ * been called yet. So we must consider topmost parents of the relations
+ * being joined, too, while determining parameterization of the result and
+ * checking for disallowed parameterization cases.
*/
if (innerrel->top_parent_relids)
innerrelids = innerrel->top_parent_relids;
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index f7ac71087a..8dbf790e89 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2360,6 +2360,9 @@ create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
* calc_nestloop_required_outer
* Compute the required_outer set for a nestloop join path
*
+ * Note: when considering a child join, the inputs nonetheless use top-level
+ * parent relids
+ *
* Note: result must not share storage with either input
*/
Relids
@@ -2394,11 +2397,30 @@ calc_non_nestloop_required_outer(Path *outer_path, Path *inner_path)
{
Relids outer_paramrels = PATH_REQ_OUTER(outer_path);
Relids inner_paramrels = PATH_REQ_OUTER(inner_path);
+ Relids innerrelids PG_USED_FOR_ASSERTS_ONLY;
+ Relids outerrelids PG_USED_FOR_ASSERTS_ONLY;
Relids required_outer;
+ /*
+ * Any parameterization of the input paths refers to topmost parents of
+ * the relevant relations, because reparameterize_path_by_child() hasn't
+ * been called yet. So we must consider topmost parents of the relations
+ * being joined, too, while checking for disallowed parameterization
+ * cases.
+ */
+ if (inner_path->parent->top_parent_relids)
+ innerrelids = inner_path->parent->top_parent_relids;
+ else
+ innerrelids = inner_path->parent->relids;
+
+ if (outer_path->parent->top_parent_relids)
+ outerrelids = outer_path->parent->top_parent_relids;
+ else
+ outerrelids = outer_path->parent->relids;
+
/* neither path can require rels from the other */
- Assert(!bms_overlap(outer_paramrels, inner_path->parent->relids));
- Assert(!bms_overlap(inner_paramrels, outer_path->parent->relids));
+ Assert(!bms_overlap(outer_paramrels, innerrelids));
+ Assert(!bms_overlap(inner_paramrels, outerrelids));
/* form the union ... */
required_outer = bms_union(outer_paramrels, inner_paramrels);
/* we do not need an explicit test for empty; bms_union gets it right */