Robert Haas <robertmh...@gmail.com> writes:
> On Sat, Jan 6, 2024 at 4:08 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
>> The argument for the patch as proposed is that we should make the
>> mergejoin and hashjoin code paths do what the nestloop path is doing.
>> However, as I replied further down in that other thread, I'm not
>> exactly convinced that the nestloop path is doing the right thing.
>> I've not had time to look closer at that, though.

> I don't really understand what you were saying in your response there,
> or what you're saying here. It seems to me that if the path is
> parameterized by top relids, and the assertion is verifying that a
> certain set of non-toprelids i.e. otherrels isn't present, then
> obviously the assertion is going to pass, but it's not checking what
> it purports to be checking.

I think we're talking at cross-purposes.  What I was wondering about
(at least further down in the thread) was whether we shouldn't be
checking *both* the "real" and the "parent" relids to make sure they
don't overlap the parameterization sets.  But it's probably overkill.
After reading the code again I agree that the parent relids are more
useful to check.

However, I still don't like Richard's patch too much as-is, because
the Asserts are difficult to read/understand and even more difficult
to compare to the other code path.  I think we want to keep the
nestloop and not-nestloop paths as visually similar as possible,
so I propose we do it more like the attached (where I also borrowed
some of your wording for the comments).

A variant of this could be to ifdef out all the added code in
non-Assert builds:

    Relids        outer_paramrels = PATH_REQ_OUTER(outer_path);
    Relids        inner_paramrels = PATH_REQ_OUTER(inner_path);
    Relids        required_outer;
#ifdef USE_ASSERT_CHECKING
    Relids        innerrelids;
    Relids        outerrelids;

    /*
     * Any parameterization of the input paths refers to topmost parents of
     * the relevant relations, because reparameterize_path_by_child() hasn't
     * been called yet.  So we must consider topmost parents of the relations
     * being joined, too, while checking for disallowed parameterization
     * cases.
     */
    if (inner_path->parent->top_parent_relids)
        innerrelids = inner_path->parent->top_parent_relids;
    else
        innerrelids = inner_path->parent->relids;

    if (outer_path->parent->top_parent_relids)
        outerrelids = outer_path->parent->top_parent_relids;
    else
        outerrelids = outer_path->parent->relids;

    /* neither path can require rels from the other */
    Assert(!bms_overlap(outer_paramrels, innerrelids));
    Assert(!bms_overlap(inner_paramrels, outerrelids));
#endif
    /* form the union ... */

but I'm not sure that's better.  Probably any reasonable compiler
will throw away the whole calculation upon noticing the outputs
are unused.

                        regards, tom lane

diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 3fd5a24fad..e9def9d540 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -730,8 +730,11 @@ try_nestloop_path(PlannerInfo *root,
 		return;
 
 	/*
-	 * Paths are parameterized by top-level parents, so run parameterization
-	 * tests on the parent relids.
+	 * Any parameterization of the input paths refers to topmost parents of
+	 * the relevant relations, because reparameterize_path_by_child() hasn't
+	 * been called yet.  So we must consider topmost parents of the relations
+	 * being joined, too, while determining parameterization of the result and
+	 * checking for disallowed parameterization cases.
 	 */
 	if (innerrel->top_parent_relids)
 		innerrelids = innerrel->top_parent_relids;
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index f7ac71087a..8dbf790e89 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2360,6 +2360,9 @@ create_foreign_upper_path(PlannerInfo *root, RelOptInfo *rel,
  * calc_nestloop_required_outer
  *	  Compute the required_outer set for a nestloop join path
  *
+ * Note: when considering a child join, the inputs nonetheless use top-level
+ * parent relids
+ *
  * Note: result must not share storage with either input
  */
 Relids
@@ -2394,11 +2397,30 @@ calc_non_nestloop_required_outer(Path *outer_path, Path *inner_path)
 {
 	Relids		outer_paramrels = PATH_REQ_OUTER(outer_path);
 	Relids		inner_paramrels = PATH_REQ_OUTER(inner_path);
+	Relids		innerrelids PG_USED_FOR_ASSERTS_ONLY;
+	Relids		outerrelids PG_USED_FOR_ASSERTS_ONLY;
 	Relids		required_outer;
 
+	/*
+	 * Any parameterization of the input paths refers to topmost parents of
+	 * the relevant relations, because reparameterize_path_by_child() hasn't
+	 * been called yet.  So we must consider topmost parents of the relations
+	 * being joined, too, while checking for disallowed parameterization
+	 * cases.
+	 */
+	if (inner_path->parent->top_parent_relids)
+		innerrelids = inner_path->parent->top_parent_relids;
+	else
+		innerrelids = inner_path->parent->relids;
+
+	if (outer_path->parent->top_parent_relids)
+		outerrelids = outer_path->parent->top_parent_relids;
+	else
+		outerrelids = outer_path->parent->relids;
+
 	/* neither path can require rels from the other */
-	Assert(!bms_overlap(outer_paramrels, inner_path->parent->relids));
-	Assert(!bms_overlap(inner_paramrels, outer_path->parent->relids));
+	Assert(!bms_overlap(outer_paramrels, innerrelids));
+	Assert(!bms_overlap(inner_paramrels, outerrelids));
 	/* form the union ... */
 	required_outer = bms_union(outer_paramrels, inner_paramrels);
 	/* we do not need an explicit test for empty; bms_union gets it right */

Reply via email to