Hi hackers.

When a star(*) expands into multiple fields, our current
implementation is to generate multiple copies of the expression
and do FieldSelects. This is very inefficient because the same
expression get evaluated multiple times but actually we only need
to do it once. This is stated in ExpandRowReference().

For example:
CREATE TABLE tbl(c1 int, c2 int, c3 int);
CREATE TABLE src(v text);
CREATE FUNCTION expensive_func(input text) RETURNS t;
INSERT INTO tbl SELECT (expensive_func(v)).* FROM src;

This is effectively the same as:
INSERT INTO tbl SELECT (expensive_func(v)).c1,
(expensive_func(v)).c2, (expensive_func(v)).c3 FROM src;

In this form, expensive_func will be evaluated for every column in
tbl. If tbl has hundreds of columns we are in trouble. To partially
solve this issue, when doing projection in ExecBuildProjectionInfo,
instead of generating normal steps for FieldSelects one by one, we
can group them by the expression(arg of FieldSelect node). Then
evaluate the epxression once to get a HeapTuple, deform it into
fields, and then assign needed fields in one step. I've attached
patch that introduce EEOP_FIELD_MULTI_SELECT_ASSIGN for this.

With this patch, the following query should generate only one
NOTICE, instead of 3.

CREATE TYPE proj_type AS (a int, b int, c text);
CREATE OR REPLACE FUNCTION proj_type_func1(input text)
RETURNS proj_type AS $$
BEGIN
    RAISE NOTICE 'proj_type_func called';
    RETURN ROW(1, 2, input);
END
$$ IMMUTABLE LANGUAGE PLPGSQL;
CREATE TEMP TABLE stage_table(a text);
INSERT INTO stage_table VALUES('aaaa');
SELECT (proj_type_func1(a)).* FROM stage_table;

This patch is just proof of concept. Some unsolved questions I
can think of right now:
- Carry some information in FieldSelect from ExpandRowReference
to assist grouping?
- This can only handle FuncExpr as the root node of FieldSelect
arg. What about a more general expression?
- How to determine whether a common expression is safe to be
optimized this way? Any unexpcted side-effects?

Any thoughts on this approach?

Best regards,
Peifeng Qiu
From 52b389466cf397c6e70168a64252abf4c7453ba6 Mon Sep 17 00:00:00 2001
From: Peifeng Qiu <pg...@qiupf.dev>
Date: Fri, 2 Dec 2022 16:37:03 +0900
Subject: [PATCH] Optimize common expressions in projection evaluation

When a star(*) expands into multiple fields, our current
implementation is to generate multiple copies of the expression
and do FieldSelects. This is very inefficient because the same
expression get evaluated multiple times but actually we only need
to do it once. This is stated in ExpandRowReference().

For example:
CREATE TABLE tbl(c1 int, c2 int, c3 int);
CREATE TABLE src(v text);
CREATE FUNCTION expensive_func(input text) RETURNS t;
INSERT INTO tbl SELECT (expensive_func(v)).* FROM src;

This is effectively the same as:
INSERT INTO tbl SELECT expensive_func(v).c1, expensive_func(v).c2,
expensive_func(v).c3 FROM src;

In this form, expensive_func will be evaluated for every column in
tbl. If tbl has hundreds of columns we are in trouble. To solve
this, when doing projection we first collect all FieldSelect nodes
and group them by arg(expensive_func(v) in the above example). Then
evaluate arg once, extract and assign all needed fields.
EEOP_FIELD_MULTI_SELECT_ASSIGN is introduced to do this.
---
 src/backend/executor/execExpr.c               | 172 ++++++++++++++++++
 src/backend/executor/execExprInterp.c         |  86 +++++++++
 src/include/executor/execExpr.h               |  12 ++
 .../regress/expected/create_function_sql.out  |  39 +++-
 src/test/regress/sql/create_function_sql.sql  |  23 +++
 5 files changed, 331 insertions(+), 1 deletion(-)

diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 81429b9f05..ef54f0027e 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -331,6 +331,160 @@ ExecInitExprList(List *nodes, PlanState *parent)
 	return result;
 }
 
+/*
+ * Special optimization for FieldSelect.
+ *
+ * When a star(*) expands into multiple fields, there's an
+ * oppurtunity to group fields with common expression to
+ * prevent the expression from being evaluated multiple
+ * times. This partially addresses the problem stated in
+ * ExpandRowReference().
+ *
+ * For example:
+ * CREATE TABLE tbl(c1 int, c2 int, c3 int...);
+ * CREATE TABLE src(v text);
+ * CREATE FUNCTION func(input text) RETURNS t;
+ * INSERT INTO tbl SELECT (func(v)).* FROM src;
+ *
+ * This will be expanded into the following SQL:
+ * INSERT INTO tbl SELECT func(v).c1, func(v).c2... FROM src
+ *
+ * In this form, func will be evaluated for every column in
+ * tbl. If func is IMMUTABLE this is a huge waste. We want
+ * to evaluate func once and then extract all the columns
+ * at the same time.
+ *
+ * When doing projection, we first collect all FieldSelect
+ * nodes and group them by arg(func(v) in the example). Then
+ * evaluate arg once and assign all needed fields.
+ * EEOP_FIELD_MULTI_SELECT_ASSIGN is introduced to do this.
+ */
+typedef struct ExpressionGroup {
+	Expr *expr;
+	List *fields;
+} ExpressionGroup;
+
+/*
+ * Check whether an expression is safe to group. Currently
+ * only allow immutable functions. We may later expand this
+ * to all types of expressions and then we need a walker.
+ */
+static bool
+ExpressionSafeToGroup(Expr *expr)
+{
+	FuncExpr *func;
+	if (!IsA(expr, FuncExpr))
+		return false;
+
+	func = (FuncExpr *)expr;
+	if (func_volatile(func->funcid) == PROVOLATILE_IMMUTABLE)
+		return true;
+	return false;
+}
+
+static void
+ExecBuildProjectionFieldSelects(List *fieldSelects, ExprState  *state)
+{
+	ExpressionGroup *groups;
+	ExprEvalStep scratch = {0};
+	ListCell *lc;
+	int n = list_length(fieldSelects);
+	if (n == 0)
+		return;
+
+	/*
+	 * At most n groups if all expressions are different from
+	 * each other.
+	 */
+	groups = palloc0(sizeof(ExpressionGroup) * n);
+
+	/*
+	 * In the above example, FieldSelect->arg is "func".
+	 * We group all FieldSelects by their arg. Each group
+	 * belongs to the same "func" and thus the expression
+	 * can be evaluated only once.
+	 *
+	 * This is a naive group by. If we have lots of fields
+	 * and groups this can be slow.
+	 */
+	foreach(lc, fieldSelects)
+	{
+		TargetEntry *tle = lfirst_node(TargetEntry, lc);
+		FieldSelect *fs = (FieldSelect*)tle->expr;
+		for (int i = 0;;i++)
+		{
+			if (groups[i].fields == NIL)
+			{
+				/* New group. */
+				groups[i].expr = fs->arg;
+				groups[i].fields = list_make1(tle);
+				break;
+			}
+			else
+			{
+				/* Existing group, check if this expr belongs to it. */
+				if (equal(fs->arg, groups[i].expr))
+				{
+					groups[i].fields = lappend(groups[i].fields, tle);
+					break;
+				}
+			}
+		}
+	}
+
+	/*
+	 * For each group, generate op to evaluate the expression first
+	 * and the generate EEOP_FIELD_MULTI_SELECT_ASSIGN to extract
+	 * all fields at once.
+	 */
+	for (int i = 0; i < n && groups[i].expr; i++)
+	{
+		int nfield = list_length(groups[i].fields);
+
+		/* How to extract the field from Tuple produced by the expr. */
+		FieldSelect *fields = palloc(sizeof(FieldSelect) * nfield);
+
+		/* Whether the field is readonly. */
+		bool *ro = palloc(sizeof(bool) * nfield);
+
+		/* Which result column should the field be assign to? */
+		int *resultnum = palloc(sizeof(int) * nfield);
+
+		/* There might be jun fields, keep track of real fields. */
+		int realFields = 0;
+
+		/* Evaluate the expression first. */
+		ExecInitExprRec(groups[i].expr, state,
+				&state->resvalue, &state->resnull);
+
+		foreach(lc, groups[i].fields)
+		{
+			TargetEntry *tle = lfirst_node(TargetEntry, lc);
+			FieldSelect *fs = (FieldSelect*)tle->expr;
+
+			/*
+			 * Column might be referenced multiple times in upper nodes, so
+			 * force value to R/O - but only if it could be an expanded datum.
+			 */
+			if (get_typlen(exprType((Node *) tle->expr)) == -1)
+				ro[realFields] = true;
+			else
+				ro[realFields] = false;
+
+			memcpy(fields + realFields, fs, sizeof(FieldSelect));
+			resultnum[realFields] = tle->resno - 1;
+			realFields++;
+		}
+
+		scratch.opcode = EEOP_FIELD_MULTI_SELECT_ASSIGN;
+		scratch.d.field_multi_select_assign.nfields = realFields;
+		scratch.d.field_multi_select_assign.fields = fields;
+		scratch.d.field_multi_select_assign.ro = ro;
+		scratch.d.field_multi_select_assign.resultnum = resultnum;
+		ExprEvalPushStep(state, &scratch);
+	}
+}
+
 /*
  *		ExecBuildProjectionInfo
  *
@@ -361,6 +515,8 @@ ExecBuildProjectionInfo(List *targetList,
 	ExprState  *state;
 	ExprEvalStep scratch = {0};
 	ListCell   *lc;
+	/* Keep track of FieldSelect nodes so we can group them. */
+	List *fieldSelects = NIL;
 
 	projInfo->pi_exprContext = econtext;
 	/* We embed ExprState into ProjectionInfo instead of doing extra palloc */
@@ -446,6 +602,20 @@ ExecBuildProjectionInfo(List *targetList,
 		}
 		else
 		{
+			/*
+			 * FieldSelect can be optimized if multiple fields comes from
+			 * the same non-trivial expression. Just save it for now.
+			 */
+			if (IsA(tle->expr, FieldSelect))
+			{
+				FieldSelect *fs = (FieldSelect *)(tle->expr);
+				if (ExpressionSafeToGroup(fs->arg))
+				{
+					fieldSelects = lappend(fieldSelects, tle);
+					continue;
+				}
+			}
+
 			/*
 			 * Otherwise, compile the column expression normally.
 			 *
@@ -470,6 +640,8 @@ ExecBuildProjectionInfo(List *targetList,
 		}
 	}
 
+	ExecBuildProjectionFieldSelects(fieldSelects, state);
+
 	scratch.opcode = EEOP_DONE;
 	ExprEvalPushStep(state, &scratch);
 
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 1dab2787b7..2d98642858 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -461,6 +461,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
 		&&CASE_EEOP_ROWCOMPARE_FINAL,
 		&&CASE_EEOP_MINMAX,
 		&&CASE_EEOP_FIELDSELECT,
+		&&CASE_EEOP_FIELD_MULTI_SELECT_ASSIGN,
 		&&CASE_EEOP_FIELDSTORE_DEFORM,
 		&&CASE_EEOP_FIELDSTORE_FORM,
 		&&CASE_EEOP_SBSREF_SUBSCRIPTS,
@@ -1423,6 +1424,14 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
 			EEO_NEXT();
 		}
 
+		EEO_CASE(EEOP_FIELD_MULTI_SELECT_ASSIGN)
+		{
+			/* deform the tuple first, then assign */
+			ExecEvalFieldMultiSelectAssign(state, op, econtext);
+
+			EEO_NEXT();
+		}
+
 		EEO_CASE(EEOP_FIELDSTORE_DEFORM)
 		{
 			/* too complex for an inline implementation */
@@ -3064,6 +3073,83 @@ ExecEvalFieldSelect(ExprState *state, ExprEvalStep *op, ExprContext *econtext)
 	}
 }
 
+/*
+ * Evaluate a group of FieldSelect nodes with common arg.
+ *
+ * This is a special optimization for multiple FieldSelect of the same
+ * record generated by a potentially expensive expression. Source record
+ * is in step's result variable.
+ */
+void
+ExecEvalFieldMultiSelectAssign(ExprState *state, ExprEvalStep *op, ExprContext *econtext)
+{
+	TupleTableSlot *resultslot = state->resultslot;
+	Oid			tupType = 0;
+	int32		tupTypmod = 0;
+	TupleDesc	tupDesc = NULL;
+	Datum		tupDatum = state->resvalue;
+
+	HeapTupleHeader tuphdr;
+	HeapTupleData tmptup;
+	ExprEvalRowtypeCache rowcache = {0};
+
+	int nfields = op->d.field_multi_select_assign.nfields;
+	FieldSelect *fields = op->d.field_multi_select_assign.fields;
+	int *resultnums = op->d.field_multi_select_assign.resultnum;
+	Datum *values;
+	bool *nulls;
+
+	if (state->resnull)
+	{
+		/* Expression returns NULL, just fill all fields as NULL. */
+		for (int i = 0; i < nfields; i++)
+		{
+			int resultnum = resultnums[i];
+			resultslot->tts_values[resultnum] = 0;
+			resultslot->tts_isnull[resultnum] = true;
+		}
+		return;
+	}
+
+	/* First, we need to deform the tuple into fields. */
+	tuphdr = DatumGetHeapTupleHeader(tupDatum);
+	tupType = HeapTupleHeaderGetTypeId(tuphdr);
+	tupTypmod = HeapTupleHeaderGetTypMod(tuphdr);
+	tupDesc = get_cached_rowtype(tupType, tupTypmod,
+								&rowcache, NULL);
+
+	tmptup.t_len = HeapTupleHeaderGetDatumLength(tuphdr);
+	ItemPointerSetInvalid(&(tmptup.t_self));
+	tmptup.t_tableOid = InvalidOid;
+	tmptup.t_data = tuphdr;
+
+	values = palloc(tupDesc->natts * sizeof(Datum));
+	nulls = palloc(tupDesc->natts * sizeof(bool));
+	heap_deform_tuple(&tmptup, tupDesc, values, nulls);
+
+	/* Assign the fields from the tuple to result */
+	for (int i = 0; i < nfields; i++)
+	{
+		/* column index of result(to) */
+		int resultnum = resultnums[i];
+		/* column index of tuple(from) */
+		int fieldnum = fields[i].fieldnum - 1;
+
+		Datum resvalue = values[fieldnum];
+		resultslot->tts_isnull[resultnum] = nulls[i];
+		if (op->d.field_multi_select_assign.ro[i])
+		{
+			if (!resultslot->tts_isnull[resultnum])
+				resultslot->tts_values[resultnum] =
+					MakeExpandedObjectReadOnlyInternal(resvalue);
+			else
+				resultslot->tts_values[resultnum] = resvalue;
+		}
+		else
+			resultslot->tts_values[resultnum] = resvalue;
+	}
+}
+
 /*
  * Deform source tuple, filling in the step's values/nulls arrays, before
  * evaluating individual new values as part of a FieldStore expression.
diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h
index 0557302b92..129e6f970b 100644
--- a/src/include/executor/execExpr.h
+++ b/src/include/executor/execExpr.h
@@ -190,6 +190,7 @@ typedef enum ExprEvalOp
 
 	/* evaluate FieldSelect expression */
 	EEOP_FIELDSELECT,
+	EEOP_FIELD_MULTI_SELECT_ASSIGN,
 
 	/*
 	 * Deform tuple before evaluating new values for individual fields in a
@@ -494,6 +495,15 @@ typedef struct ExprEvalStep
 			ExprEvalRowtypeCache rowcache;
 		}			fieldselect;
 
+		/* for EEOP_FIELD_MULTI_SELECT_ASSIGN */
+		struct
+		{
+			int 		nfields;
+			FieldSelect *fields;
+			int		   *resultnum;
+			bool	   *ro;
+		}			field_multi_select_assign;
+
 		/* for EEOP_FIELDSTORE_DEFORM / FIELDSTORE_FORM */
 		struct
 		{
@@ -747,6 +757,8 @@ extern void ExecEvalRow(ExprState *state, ExprEvalStep *op);
 extern void ExecEvalMinMax(ExprState *state, ExprEvalStep *op);
 extern void ExecEvalFieldSelect(ExprState *state, ExprEvalStep *op,
 								ExprContext *econtext);
+extern void ExecEvalFieldMultiSelectAssign(ExprState *state, ExprEvalStep *op,
+										   ExprContext *econtext);
 extern void ExecEvalFieldStoreDeForm(ExprState *state, ExprEvalStep *op,
 									 ExprContext *econtext);
 extern void ExecEvalFieldStoreForm(ExprState *state, ExprEvalStep *op,
diff --git a/src/test/regress/expected/create_function_sql.out b/src/test/regress/expected/create_function_sql.out
index 50aca5940f..9fe4b38339 100644
--- a/src/test/regress/expected/create_function_sql.out
+++ b/src/test/regress/expected/create_function_sql.out
@@ -706,9 +706,43 @@ LINE 2:     AS 'SELECT $2;';
 CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL
     AS 'a', 'b';
 ERROR:  only one AS item needed for language "sql"
+-- Test common expr in projection optimization
+CREATE TYPE proj_type AS (a int, b int, c text);
+CREATE OR REPLACE FUNCTION proj_type_func1(input text) RETURNS proj_type AS $$
+BEGIN
+    RAISE NOTICE 'proj_type_func called';
+    RETURN ROW(1, 2, input);
+END
+$$ IMMUTABLE LANGUAGE PLPGSQL;
+CREATE OR REPLACE FUNCTION proj_type_func2(input text) RETURNS proj_type AS $$
+BEGIN
+    RAISE NOTICE 'proj_type_func called';
+    RETURN ROW(1, 2, input);
+END
+$$ VOLATILE LANGUAGE PLPGSQL;
+CREATE TEMP TABLE stage_table(a text);
+INSERT INTO stage_table VALUES('aaaa');
+-- Immutable function only called once
+SELECT (proj_type_func1(a)).* FROM stage_table;
+NOTICE:  proj_type_func called
+ a | b |  c   
+---+---+------
+ 1 | 2 | aaaa
+(1 row)
+
+-- Volatile function called 3 times
+SELECT (proj_type_func2(a)).* FROM stage_table;
+NOTICE:  proj_type_func called
+NOTICE:  proj_type_func called
+NOTICE:  proj_type_func called
+ a | b |  c   
+---+---+------
+ 1 | 2 | aaaa
+(1 row)
+
 -- Cleanup
 DROP SCHEMA temp_func_test CASCADE;
-NOTICE:  drop cascades to 30 other objects
+NOTICE:  drop cascades to 33 other objects
 DETAIL:  drop cascades to function functest_a_1(text,date)
 drop cascades to function functest_a_2(text[])
 drop cascades to function functest_a_3()
@@ -739,5 +773,8 @@ drop cascades to function voidtest3(integer)
 drop cascades to function voidtest4(integer)
 drop cascades to function voidtest5(integer)
 drop cascades to function double_append(anyarray,anyelement)
+drop cascades to type proj_type
+drop cascades to function proj_type_func1(text)
+drop cascades to function proj_type_func2(text)
 DROP USER regress_unpriv_user;
 RESET search_path;
diff --git a/src/test/regress/sql/create_function_sql.sql b/src/test/regress/sql/create_function_sql.sql
index 89e9af3a49..507ffc0d8b 100644
--- a/src/test/regress/sql/create_function_sql.sql
+++ b/src/test/regress/sql/create_function_sql.sql
@@ -415,6 +415,29 @@ CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL
 CREATE FUNCTION test1 (int) RETURNS int LANGUAGE SQL
     AS 'a', 'b';
 
+-- Test common expr in projection optimization
+CREATE TYPE proj_type AS (a int, b int, c text);
+CREATE OR REPLACE FUNCTION proj_type_func1(input text) RETURNS proj_type AS $$
+BEGIN
+    RAISE NOTICE 'proj_type_func called';
+    RETURN ROW(1, 2, input);
+END
+$$ IMMUTABLE LANGUAGE PLPGSQL;
+CREATE OR REPLACE FUNCTION proj_type_func2(input text) RETURNS proj_type AS $$
+BEGIN
+    RAISE NOTICE 'proj_type_func called';
+    RETURN ROW(1, 2, input);
+END
+$$ VOLATILE LANGUAGE PLPGSQL;
+
+CREATE TEMP TABLE stage_table(a text);
+INSERT INTO stage_table VALUES('aaaa');
+
+-- Immutable function only called once
+SELECT (proj_type_func1(a)).* FROM stage_table;
+-- Volatile function called 3 times
+SELECT (proj_type_func2(a)).* FROM stage_table;
+
 -- Cleanup
 DROP SCHEMA temp_func_test CASCADE;
 DROP USER regress_unpriv_user;
-- 
2.38.1

Reply via email to