On Fri, Mar 22, 2024 at 2:54 PM Amit Langote <amitlangot...@gmail.com>
wrote:

> >
> > Please let me know if you need anything.
>
> Thanks for repeating the benchmark.  So I don't see a problem with
> keeping the existing palloc() way of allocating the
> SpecialJoinInfos().  We're adding a few cycles by adding
> free_child_join_sjinfo() (see my delta patch about the rename), but
> the reduction in memory consumption due to it, which is our goal here,
> far outweighs what looks like a minor CPU slowdown.
>
> I have squashed 0002, 0003 into 0001, done some changes of my own, and
> attached the delta patch as 0002.  I've listed the changes in the
> commit message.  Let me know if anything seems amiss.
>
> Thanks Amit for your changes.

> * Revert (IMO) unnecessary modifications of build_child_join_sjinfo().
> For example, hunks to rename sjinfo to child_sjinfo seem unnecessary,
> because it's clear from the context that they are "child" sjinfos.
>

Ok.

> * Rename free_child_sjinfo_members() to free_child_join_sjinfo() to
> be symmetric with build_child_join_sjinfo().  Note that the function
> is charged with freeing the sjinfo itself too.  Like
> build_child_join_sjinfo(), name the argument sjinfo, not
> child_sjinfo.

Yes. It should have been done in my patch.

>
> * Don't set the child sjinfo pointer to NULL after freeing it.  While
> a good convention in general, it's not really common in the planner
> code, so doing it here seems a bit overkill

That function is the last line in the block where pointer variable is
declared.
As of now there is no danger of the dangling pointer being used.

>
> * Rename make_dummy_sjinfo() to init_dummy_sjinfo() because the
> function is not really making it per se, only initializing fields
> in the SpecialJoinInfo struct made available by the caller.

Right.

>
> * Make init_dummy_sjinfo() take Relids instead of RelOptInfos because
> that's what's needed.  Also allows to reduce changes to
> build_child_join_sjinfo()'s signature.

I remember, in one of the versions it was Relids and then I changed to
RelOptInfos for some reason. The latest version seems to use just Relids. So
this looks better.

>
> * Update the reason mentioned in a comment in free_child_join_sjinfo()
> about why semi_rhs_expr is not freed -- it may be freed, but there's
> no way to "deep" free it, so we leave it alone.
>
> * Update a comment somewhere about why SpecialJoinInfo can be
> "transient" sometimes.

With the later code, semi_rhs_expr is indeed free'able. It wasn't so when I
wrote the patches. Thanks for updating the comment. I wish we would have
freeObject(). Nothing that can be done for this patch.

Attached patches
Squashed your 0001 and 0002 into 0001. I have also reworded the commit
message to reflect the latest code changes.
0002 has some minor edits. Please feel free to include or reject when
committing the patch.

-- 
Best Wishes,
Ashutosh Bapat
From 555cb9825eaa784d23c80c28d2cee27570cad781 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <ashutosh.ba...@enterprisedb.com>
Date: Fri, 22 Mar 2024 15:39:08 +0530
Subject: [PATCH 2/2] Minor edits

Ashutosh Bapat
---
 src/backend/optimizer/path/joinrels.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index bef5b5f213..5d055be855 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -659,10 +659,10 @@ join_is_legal(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
  *    Populate the given SpecialJoinInfo for a plain inner join between rel1
  *    and rel2
  *
- * Inner joins do not normally have an associated SpecialJoinInfo node.  But
- * some functions involved in join planning require one to be passed,
- * containing at least the information of which relations are being joined,
- * so we initialize that information here.
+ * Normally, an inner join does not have a SpecialJoinInfo node associated with
+ * it. But some functions involved in join planning require one containing at
+ * least the information of which relations are being joined.  So we initialize
+ * that information here.
  */
 void
 init_dummy_sjinfo(SpecialJoinInfo *sjinfo, Relids left_relids,
@@ -1686,7 +1686,7 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
  * are the relids of left and right side of the join respectively.
  *
  * If translations are added to or removed from this function, consider
- * updating free_child_join_sjinfo() appropriately.
+ * updating free_child_join_sjinfo() accordingly.
  */
 static SpecialJoinInfo *
 build_child_join_sjinfo(PlannerInfo *root, SpecialJoinInfo *parent_sjinfo,
-- 
2.25.1

From 42c4eda04cbf7dc786f6ea6e7c39e23c3e3e1715 Mon Sep 17 00:00:00 2001
From: Amit Langote <amit...@postgresql.org>
Date: Fri, 22 Mar 2024 14:07:48 +0900
Subject: [PATCH 1/2] Reduce memory used by child SpecialJoinInfo

The SpecialJoinInfo applicable to a child join is computed by
translating the same applicable to the parent join in
try_partitionwise_join(). The child SpecialJoinInfo is not needed once
the child join RelOptInfo is created and paths are added to it. Free the
child SpecialJoinInfo and various Bitmapsets in it.  But we do not free
the semi_rhs_exprs for the lack of a handy function to free expression
trees.

Plain inner joins do not have SpecialJoinInfos associated with them.
They are crafted on the fly for parent joins. Do the same for child
joins as well.

Author: Ashutosh Bapat <ashutosh.bapat....@gmail.com>
Reviewed-by: Richard Guo <guofengli...@gmail.com>
Reviewed-by: Amit Langote <amitlangot...@gmail.com>
Reviewed-by: Andrey Lepikhov <a.lepik...@postgrespro.ru>
Reviewed-by: Tomas Vondra <tomas.von...@enterprisedb.com>
Discussion: https://postgr.es/m/caexhw5thqef3asvqvffcghygpfpy7o3xnvhhwbgbjfmrh8k...@mail.gmail.com
---
 src/backend/optimizer/path/costsize.c | 37 +----------
 src/backend/optimizer/path/joinrels.c | 94 ++++++++++++++++++++++-----
 src/backend/optimizer/util/pathnode.c |  5 +-
 src/include/nodes/pathnodes.h         |  4 ++
 src/include/optimizer/paths.h         |  2 +
 5 files changed, 89 insertions(+), 53 deletions(-)

diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index 83a0aed051..a18f8f7130 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -5049,23 +5049,7 @@ compute_semi_anti_join_factors(PlannerInfo *root,
 	/*
 	 * Also get the normal inner-join selectivity of the join clauses.
 	 */
-	norm_sjinfo.type = T_SpecialJoinInfo;
-	norm_sjinfo.min_lefthand = outerrel->relids;
-	norm_sjinfo.min_righthand = innerrel->relids;
-	norm_sjinfo.syn_lefthand = outerrel->relids;
-	norm_sjinfo.syn_righthand = innerrel->relids;
-	norm_sjinfo.jointype = JOIN_INNER;
-	norm_sjinfo.ojrelid = 0;
-	norm_sjinfo.commute_above_l = NULL;
-	norm_sjinfo.commute_above_r = NULL;
-	norm_sjinfo.commute_below_l = NULL;
-	norm_sjinfo.commute_below_r = NULL;
-	/* we don't bother trying to make the remaining fields valid */
-	norm_sjinfo.lhs_strict = false;
-	norm_sjinfo.semi_can_btree = false;
-	norm_sjinfo.semi_can_hash = false;
-	norm_sjinfo.semi_operators = NIL;
-	norm_sjinfo.semi_rhs_exprs = NIL;
+	init_dummy_sjinfo(&norm_sjinfo, outerrel->relids, innerrel->relids);
 
 	nselec = clauselist_selectivity(root,
 									joinquals,
@@ -5218,23 +5202,8 @@ approx_tuple_count(PlannerInfo *root, JoinPath *path, List *quals)
 	/*
 	 * Make up a SpecialJoinInfo for JOIN_INNER semantics.
 	 */
-	sjinfo.type = T_SpecialJoinInfo;
-	sjinfo.min_lefthand = path->outerjoinpath->parent->relids;
-	sjinfo.min_righthand = path->innerjoinpath->parent->relids;
-	sjinfo.syn_lefthand = path->outerjoinpath->parent->relids;
-	sjinfo.syn_righthand = path->innerjoinpath->parent->relids;
-	sjinfo.jointype = JOIN_INNER;
-	sjinfo.ojrelid = 0;
-	sjinfo.commute_above_l = NULL;
-	sjinfo.commute_above_r = NULL;
-	sjinfo.commute_below_l = NULL;
-	sjinfo.commute_below_r = NULL;
-	/* we don't bother trying to make the remaining fields valid */
-	sjinfo.lhs_strict = false;
-	sjinfo.semi_can_btree = false;
-	sjinfo.semi_can_hash = false;
-	sjinfo.semi_operators = NIL;
-	sjinfo.semi_rhs_exprs = NIL;
+	init_dummy_sjinfo(&sjinfo, path->outerjoinpath->parent->relids,
+					  path->innerjoinpath->parent->relids);
 
 	/* Get the approximate selectivity */
 	foreach(l, quals)
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 4750579b0a..bef5b5f213 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -45,6 +45,7 @@ static void try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1,
 static SpecialJoinInfo *build_child_join_sjinfo(PlannerInfo *root,
 												SpecialJoinInfo *parent_sjinfo,
 												Relids left_relids, Relids right_relids);
+static void free_child_join_sjinfo(SpecialJoinInfo *child_sjinfo);
 static void compute_partition_bounds(PlannerInfo *root, RelOptInfo *rel1,
 									 RelOptInfo *rel2, RelOptInfo *joinrel,
 									 SpecialJoinInfo *parent_sjinfo,
@@ -653,6 +654,38 @@ join_is_legal(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 	return true;
 }
 
+/*
+ * init_dummy_sjinfo
+ *    Populate the given SpecialJoinInfo for a plain inner join between rel1
+ *    and rel2
+ *
+ * Inner joins do not normally have an associated SpecialJoinInfo node.  But
+ * some functions involved in join planning require one to be passed,
+ * containing at least the information of which relations are being joined,
+ * so we initialize that information here.
+ */
+void
+init_dummy_sjinfo(SpecialJoinInfo *sjinfo, Relids left_relids,
+				  Relids right_relids)
+{
+	sjinfo->type = T_SpecialJoinInfo;
+	sjinfo->min_lefthand = left_relids;
+	sjinfo->min_righthand = right_relids;
+	sjinfo->syn_lefthand = left_relids;
+	sjinfo->syn_righthand = right_relids;
+	sjinfo->jointype = JOIN_INNER;
+	sjinfo->ojrelid = 0;
+	sjinfo->commute_above_l = NULL;
+	sjinfo->commute_above_r = NULL;
+	sjinfo->commute_below_l = NULL;
+	sjinfo->commute_below_r = NULL;
+	/* we don't bother trying to make the remaining fields valid */
+	sjinfo->lhs_strict = false;
+	sjinfo->semi_can_btree = false;
+	sjinfo->semi_can_hash = false;
+	sjinfo->semi_operators = NIL;
+	sjinfo->semi_rhs_exprs = NIL;
+}
 
 /*
  * make_join_rel
@@ -716,23 +749,7 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 	if (sjinfo == NULL)
 	{
 		sjinfo = &sjinfo_data;
-		sjinfo->type = T_SpecialJoinInfo;
-		sjinfo->min_lefthand = rel1->relids;
-		sjinfo->min_righthand = rel2->relids;
-		sjinfo->syn_lefthand = rel1->relids;
-		sjinfo->syn_righthand = rel2->relids;
-		sjinfo->jointype = JOIN_INNER;
-		sjinfo->ojrelid = 0;
-		sjinfo->commute_above_l = NULL;
-		sjinfo->commute_above_r = NULL;
-		sjinfo->commute_below_l = NULL;
-		sjinfo->commute_below_r = NULL;
-		/* we don't bother trying to make the remaining fields valid */
-		sjinfo->lhs_strict = false;
-		sjinfo->semi_can_btree = false;
-		sjinfo->semi_can_hash = false;
-		sjinfo->semi_operators = NIL;
-		sjinfo->semi_rhs_exprs = NIL;
+		init_dummy_sjinfo(sjinfo, rel1->relids, rel2->relids);
 	}
 
 	/*
@@ -1659,6 +1676,7 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 									child_restrictlist);
 
 		pfree(appinfos);
+		free_child_join_sjinfo(child_sjinfo);
 	}
 }
 
@@ -1666,6 +1684,9 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
  * Construct the SpecialJoinInfo for a child-join by translating
  * SpecialJoinInfo for the join between parents. left_relids and right_relids
  * are the relids of left and right side of the join respectively.
+ *
+ * If translations are added to or removed from this function, consider
+ * updating free_child_join_sjinfo() appropriately.
  */
 static SpecialJoinInfo *
 build_child_join_sjinfo(PlannerInfo *root, SpecialJoinInfo *parent_sjinfo,
@@ -1677,6 +1698,14 @@ build_child_join_sjinfo(PlannerInfo *root, SpecialJoinInfo *parent_sjinfo,
 	AppendRelInfo **right_appinfos;
 	int			right_nappinfos;
 
+	/* Dummy SpecialJoinInfos can be created without any translation. */
+	if (parent_sjinfo->jointype == JOIN_INNER)
+	{
+		Assert(parent_sjinfo->ojrelid == 0);
+		init_dummy_sjinfo(sjinfo, left_relids, right_relids);
+		return sjinfo;
+	}
+
 	memcpy(sjinfo, parent_sjinfo, sizeof(SpecialJoinInfo));
 	left_appinfos = find_appinfos_by_relids(root, left_relids,
 											&left_nappinfos);
@@ -1705,6 +1734,37 @@ build_child_join_sjinfo(PlannerInfo *root, SpecialJoinInfo *parent_sjinfo,
 	return sjinfo;
 }
 
+/*
+ * free_child_join_sjinfo
+ *		Free memory consumed by a SpecialJoinInfo created by
+ *		build_child_join_sjinfo()
+ *
+ * Only members that are translated copies of their counterpart in the parent
+ * SpecialJoinInfo are freed here; see build_child_join_sjinfo().
+ */
+static void
+free_child_join_sjinfo(SpecialJoinInfo *sjinfo)
+{
+	/*
+	 * Dummy SpecialJoinInfos of inner joins do not have any translated fields
+	 * and hence have nothing to be freed; see init_dummy_sjinfo().
+	 */
+	if (sjinfo->jointype != JOIN_INNER)
+	{
+		bms_free(sjinfo->min_lefthand);
+		bms_free(sjinfo->min_righthand);
+		bms_free(sjinfo->syn_lefthand);
+		bms_free(sjinfo->syn_righthand);
+
+		/*
+		 * semi_rhs_exprs may in principle be freed, but a simple pfree() does
+		 * not suffice, so we leave it alone.
+		 */
+	}
+
+	pfree(sjinfo);
+}
+
 /*
  * compute_partition_bounds
  *		Compute the partition bounds for a join rel from those for inputs
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 246cd8f747..8f5b0770b1 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -1705,8 +1705,9 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath,
 	pathnode->subpath = subpath;
 
 	/*
-	 * Under GEQO, the sjinfo might be short-lived, so we'd better make copies
-	 * of data structures we extract from it.
+	 * Under GEQO and when planning child joins, the sjinfo might be
+	 * short-lived, so we'd better make copies of data structures we extract
+	 * from it.
 	 */
 	pathnode->in_operators = copyObject(sjinfo->semi_operators);
 	pathnode->uniq_exprs = copyObject(sjinfo->semi_rhs_exprs);
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 534692bee1..7192ae5171 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -2854,6 +2854,10 @@ typedef struct PlaceHolderVar
  * cost estimation purposes it is sometimes useful to know the join size under
  * plain innerjoin semantics.  Note that lhs_strict and the semi_xxx fields
  * are not set meaningfully within such structs.
+ *
+ * We also create transient SpecialJoinInfos for child joins by translating
+ * corresponding parent SpecialJoinInfos. These are also not part of
+ * PlannerInfo's join_info_list.
  */
 #ifndef HAVE_SPECIALJOININFO_TYPEDEF
 typedef struct SpecialJoinInfo SpecialJoinInfo;
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index b137c8a589..125676b5bf 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -112,6 +112,8 @@ extern bool have_join_order_restriction(PlannerInfo *root,
 extern bool have_dangerous_phv(PlannerInfo *root,
 							   Relids outer_relids, Relids inner_params);
 extern void mark_dummy_rel(RelOptInfo *rel);
+extern void init_dummy_sjinfo(SpecialJoinInfo *sjinfo, Relids left_relids,
+							  Relids right_relids);
 
 /*
  * equivclass.c
-- 
2.25.1

Reply via email to