Re: calling procedures is slow and consumes extra much memory against calling function

Pavel Stehule Thu, 16 Jul 2020 12:09:26 -0700

Hi

I am sending another patch that tries to allow CachedPlans for CALL
statements. I think this patch is very accurate, but it is not nice,
because it is smudging very precious reference counting for CachedPlans.


Current issue:
==========

I found a problem with repeated CALL statements from DO command. For every
execution of a CALL statement a plan is created that is released at the
time of the end of DO block.

create or replace procedure insert_into_foo(i int)
as $$
begin
  insert into foo values(i, i || 'Ahoj');
  if i % 1000 = 0 then raise notice '%', i;
    --commit;
  end if;
end;
$$ language plpgsql;

and DO

do $$
begin
  for i in 1..500000
  loop
    call insert_into_foo(i);
  end loop;
end
$$;

Requires about 2.5GB RAM (execution time is 18 sec). The problem is "long
transaction" with 500M iteration of CALL statement.

If I try to remove a comment before COMMIT - then I get 500 transactions.
But still it needs 2.5GB memory.

The reason for this behaviour is disabling plan cache for CALL statements
executed in atomic mode.

So I wrote patch 1, that releases the not saved plan immediately. This
patch is very simple, and fixes memory issues. It is a little bit faster
(14 sec), and Postgres consumes about 200KB.

Patch 1 is simple, clean, nice but execution of CALL statements is slow due
repeated planning.

I tried to fix this issue another way - by little bit different work with
plan cache reference counting. Current design expects only statements
wrapped inside transactions. It is not designed for new possibilities in
CALL statements, when more transactions can be finished inside one
statement. Principally - cached plans should not be reused in different
transactions (simple expressions are an exception). So if we try to use
cached plans for CALL statements, there is no clean responsibility who has
to close a cached plan. It can be SPI (when execution stays in the same
transaction), or resource owner (when transaction is finished inside
execution of SPI).

The Peter wrote a comment about it

<--><--><-->/*
<--><--><--> * Don't save the plan if not in atomic context.  Otherwise,
<--><--><--> * transaction ends would cause errors about plancache leaks.
<--><--><--> *

This comment is not fully accurate. If we try to save the plan, then
execution (with finished transaction inside) ends by segfault. Cached plan
is released on transaction end (by resource owner) and related memory
context is released. But next time this structure is accessed. There is
only a warning about unclosed plan cache (it maybe depends on other things).

I wrote a patch 2 that marks CALL statement related plans as "fragile". In
this case the plan is cached every time. There is a special mark "fragile",
that blocks immediately releasing related memory context, and it blocks
warnings and errors because for this case we expect closing plan cache by
resource owner or by SPI statement.

It reduces well CPU and memory overhead - execution time (in one big
transaction is only 8sec) - memory overhead is +/- 0

Patch2 is not too clear, and too readable although I think so it is more
correct. It better fixes SPI behaviour against new state - possibility to
commit, rollback inside procedures (inside SPI call).

All regress tests passed.

Regards

Pavel

diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 055ebb77ae..d6bbbc3dbd 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -24,6 +24,7 @@
 #include "executor/executor.h"
 #include "executor/spi_priv.h"
 #include "miscadmin.h"
+#include "storage/proc.h"
 #include "tcop/pquery.h"
 #include "tcop/utility.h"
 #include "utils/builtins.h"
@@ -2223,6 +2224,9 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 	CachedPlan *cplan = NULL;
 	ListCell   *lc1;
 
+	LocalTransactionId before_lxid = MyProc->lxid;
+	bool		is_fragile = plan->fragile;
+
 	/*
 	 * Setup error traceback support for ereport()
 	 */
@@ -2326,6 +2330,8 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 		 * plan, the refcount must be backed by the CurrentResourceOwner.
 		 */
 		cplan = GetCachedPlan(plansource, paramLI, plan->saved, _SPI_current->queryEnv);
+		cplan->is_fragile = is_fragile;
+
 		stmt_list = cplan->stmt_list;
 
 		/*
@@ -2520,8 +2526,19 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 			}
 		}
 
-		/* Done with this plan, so release refcount */
-		ReleaseCachedPlan(cplan, plan->saved);
+		if (is_fragile)
+		{
+			LocalTransactionId after_lxid  = MyProc->lxid;;
+
+			if (before_lxid == after_lxid)
+				ReleaseCachedPlan(cplan, plan->saved);
+
+			FinalReleaseFragileCachedPlan(cplan);
+		}
+		else
+			/* Done with this plan, so release refcount */
+			ReleaseCachedPlan(cplan, plan->saved);
+
 		cplan = NULL;
 
 		/*
@@ -2539,9 +2556,25 @@ fail:
 	if (pushed_active_snap)
 		PopActiveSnapshot();
 
-	/* We no longer need the cached plan refcount, if any */
 	if (cplan)
-		ReleaseCachedPlan(cplan, plan->saved);
+	{
+		if (is_fragile)
+		{
+			LocalTransactionId after_lxid  = MyProc->lxid;;
+
+			/*
+			 * Release plan, when the transaction is same. When
+			 * there are new transaction, then cached plan was
+			 * released by resource owner.
+			 */
+			if (before_lxid == after_lxid)
+				ReleaseCachedPlan(cplan, plan->saved);
+
+			FinalReleaseFragileCachedPlan(cplan);
+		}
+		else
+			ReleaseCachedPlan(cplan, plan->saved);
+	}
 
 	/*
 	 * Pop the error context stack
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 75b475c179..de3ed036f2 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -528,10 +528,14 @@ ReleaseGenericPlan(CachedPlanSource *plansource)
 	if (plansource->gplan)
 	{
 		CachedPlan *plan = plansource->gplan;
+		bool		is_fragile = plan->is_fragile;
 
 		Assert(plan->magic == CACHEDPLAN_MAGIC);
 		plansource->gplan = NULL;
 		ReleaseCachedPlan(plan, false);
+
+		if (is_fragile)
+			FinalReleaseFragileCachedPlan(plan);
 	}
 }
 
@@ -1265,9 +1269,12 @@ ReleaseCachedPlan(CachedPlan *plan, bool useResOwner)
 		Assert(plan->is_saved);
 		ResourceOwnerForgetPlanCacheRef(CurrentResourceOwner, plan);
 	}
+
 	Assert(plan->refcount > 0);
 	plan->refcount--;
-	if (plan->refcount == 0)
+
+	/* We don't want to destroy fragile plans immediately */
+	if (plan->refcount == 0 && !plan->is_fragile)
 	{
 		/* Mark it no longer valid */
 		plan->magic = 0;
@@ -1278,6 +1285,23 @@ ReleaseCachedPlan(CachedPlan *plan, bool useResOwner)
 	}
 }
 
+void
+FinalReleaseFragileCachedPlan(CachedPlan *plan)
+{
+
+	Assert(plan->magic == CACHEDPLAN_MAGIC);
+	Assert(plan->is_fragile);
+
+	if (plan->refcount == 0)
+	{
+		plan->magic = 0;
+
+		/* One-shot plans do not own their context, so we can't free them */
+		if (!plan->is_oneshot)
+			MemoryContextDelete(plan->context);
+	}
+}
+
 /*
  * CachedPlanAllowsSimpleValidityCheck: can we use CachedPlanIsSimplyValid?
  *
diff --git a/src/backend/utils/resowner/resowner.c b/src/backend/utils/resowner/resowner.c
index 8bc2c4e9ea..b328729ebd 100644
--- a/src/backend/utils/resowner/resowner.c
+++ b/src/backend/utils/resowner/resowner.c
@@ -635,7 +635,13 @@ ResourceOwnerReleaseInternal(ResourceOwner owner,
 		{
 			CachedPlan *res = (CachedPlan *) DatumGetPointer(foundres);
 
-			if (isCommit)
+			/*
+			 * The transaction can be ended inside execution of CALL statements.
+			 * In this case, the caller cannot to clean own resources, and then
+			 * cleaning by resource owner is expected. Unfortunatelly a caller
+			 * routine doesn't know if this situation will be or not.
+			 */
+			if (isCommit && !res->is_fragile)
 				PrintPlanCacheLeakWarning(res);
 			ReleaseCachedPlan(res, true);
 		}
@@ -1140,7 +1146,7 @@ ResourceOwnerRememberPlanCacheRef(ResourceOwner owner, CachedPlan *plan)
 void
 ResourceOwnerForgetPlanCacheRef(ResourceOwner owner, CachedPlan *plan)
 {
-	if (!ResourceArrayRemove(&(owner->planrefarr), PointerGetDatum(plan)))
+	if (!ResourceArrayRemove(&(owner->planrefarr), PointerGetDatum(plan)) && !plan->is_fragile)
 		elog(ERROR, "plancache reference %p is not owned by resource owner %s",
 			 plan, owner->name);
 }
diff --git a/src/include/executor/spi_priv.h b/src/include/executor/spi_priv.h
index 6220928bd3..e1a01ecc0b 100644
--- a/src/include/executor/spi_priv.h
+++ b/src/include/executor/spi_priv.h
@@ -93,6 +93,7 @@ typedef struct _SPI_plan
 	bool		saved;			/* saved or unsaved plan? */
 	bool		oneshot;		/* one-shot plan? */
 	bool		no_snapshots;	/* let the caller handle the snapshots */
+	bool		fragile;		/* plancache can be released by resource owner */
 	List	   *plancache_list; /* one CachedPlanSource per parsetree */
 	MemoryContext plancxt;		/* Context containing _SPI_plan and data */
 	int			cursor_options; /* Cursor options used for planning */
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 522020d763..d657eb0f17 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -150,6 +150,7 @@ typedef struct CachedPlan
 	bool		is_oneshot;		/* is it a "oneshot" plan? */
 	bool		is_saved;		/* is CachedPlan in a long-lived context? */
 	bool		is_valid;		/* is the stmt_list currently valid? */
+	bool		is_fragile;		/* cleaning by resource owner is expected */
 	Oid			planRoleId;		/* Role ID the plan was created for */
 	bool		dependsOnRole;	/* is plan specific to that role? */
 	TransactionId saved_xmin;	/* if valid, replan when TransactionXmin
@@ -222,6 +223,8 @@ extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
 								 QueryEnvironment *queryEnv);
 extern void ReleaseCachedPlan(CachedPlan *plan, bool useResOwner);
 
+extern void FinalReleaseFragileCachedPlan(CachedPlan *plan);
+
 extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
 												CachedPlan *plan,
 												ResourceOwner owner);
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index d4a3d58daa..0f2ba6b0f6 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -2171,7 +2171,7 @@ exec_stmt_call(PLpgSQL_execstate *estate, PLpgSQL_stmt_call *stmt)
 			 * XXX This would be fixable with some plancache/resowner surgery
 			 * elsewhere, but for now we'll just work around this here.
 			 */
-			exec_prepare_plan(estate, expr, 0, estate->atomic);
+			exec_prepare_plan(estate, expr, 0, true);
 
 			/*
 			 * The procedure call could end transactions, which would upset
@@ -2181,6 +2181,15 @@ exec_stmt_call(PLpgSQL_execstate *estate, PLpgSQL_stmt_call *stmt)
 			plan = expr->plan;
 			plan->no_snapshots = true;
 
+			/*
+			 * Plancache have to be released on end of tranaction. But for this
+			 * case, transaction can be ended somewhere inside CALL statements
+			 * chain. We cannot to know where and why. So we should to mark
+			 * this plan as fragile. In this case is possible, there are
+			 * more actors are responsible to cleaning plan cache.
+			 */
+			plan->fragile = true;
+
 			/*
 			 * Force target to be recalculated whenever the plan changes, in
 			 * case the procedure's argument list has changed.

diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index d4a3d58daa..22c0376795 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -2150,15 +2150,16 @@ static int
 exec_stmt_call(PLpgSQL_execstate *estate, PLpgSQL_stmt_call *stmt)
 {
 	PLpgSQL_expr *expr = stmt->expr;
+	volatile SPIPlanPtr	plan = expr->plan;
 	volatile LocalTransactionId before_lxid;
 	LocalTransactionId after_lxid;
 	volatile bool pushed_active_snap = false;
 	volatile int rc;
+	volatile bool plan_owner = false;
 
 	/* PG_TRY to ensure we clear the plan link, if needed, on failure */
 	PG_TRY();
 	{
-		SPIPlanPtr	plan = expr->plan;
 		ParamListInfo paramLI;
 
 		if (plan == NULL)
@@ -2173,6 +2174,16 @@ exec_stmt_call(PLpgSQL_execstate *estate, PLpgSQL_stmt_call *stmt)
 			 */
 			exec_prepare_plan(estate, expr, 0, estate->atomic);
 
+			/*
+			 * Not saved plan should be explicitly released. Without it, any
+			 * not recursive usage of CALL statemenst leak plan in SPI memory.
+			 * The created plan can be reused when procedure is called recursively,
+			 * and releasing plan can be done only in recursion root call, when
+			 * expression has not assigned plan. Where a plan was created, then
+			 * there plan can be released.
+			 */
+			plan_owner = true;
+
 			/*
 			 * The procedure call could end transactions, which would upset
 			 * the snapshot management in SPI_execute*, so don't let it do it.
@@ -2319,14 +2330,30 @@ exec_stmt_call(PLpgSQL_execstate *estate, PLpgSQL_stmt_call *stmt)
 		 * If we aren't saving the plan, unset the pointer.  Note that it
 		 * could have been unset already, in case of a recursive call.
 		 */
-		if (expr->plan && !expr->plan->saved)
+		if (plan && !plan->saved)
+		{
+			if (plan_owner)
+				SPI_freeplan(plan);
+
 			expr->plan = NULL;
+		}
 		PG_RE_THROW();
 	}
 	PG_END_TRY();
 
-	if (expr->plan && !expr->plan->saved)
+	if (plan && !plan->saved)
+	{
+		if (plan_owner)
+			SPI_freeplan(plan);
+
+		/*
+		 * If expr->plan is present, it must be the same plan that we
+		 * allocated
+		 */
+		Assert (!expr->plan || plan == expr->plan);
+
 		expr->plan = NULL;
+	}
 
 	if (rc < 0)
 		elog(ERROR, "SPI_execute_plan_with_paramlist failed executing query \"%s\": %s",

Re: calling procedures is slow and consumes extra much memory against calling function

Reply via email to