On Thu, 5 Jun 2025 16:26:08 +0900
Yugo Nagata <nag...@sraoss.co.jp> wrote:

> On Tue, 3 Jun 2025 17:39:50 +0900
> Yugo Nagata <nag...@sraoss.co.jp> wrote:
> 
> > On Tue, 27 May 2025 09:00:01 +0300
> > Alexander Lakhin <exclus...@gmail.com> wrote:
> 
> > I know there are other scenarios where the same is raises and I agree that
> > it would be better to consider a more global solution instead of addressing
> > each of them. However, I am not sure that improving the error message for
> > each case doesn't not make sense.
> 
> To address the remaining cases where DDL commands fail with the internal
> error "ERROR:  tuple concurrently updated" due to insufficient locking,
> I would like to propose improving the error reporting to produce a more
> appropriate and user-facing error message. This should make it easier for
> users to understand the cause of the failure.
> 
> Patch 0003 improves the error message shown when concurrent updates to a
> system catalog tuple occur, producing output like:
> 
>  ERROR:  operation failed due to a concurrent command
>  DETAIL: Another command modified the same object in a concurrent session.
> 
> Patches 0001 and 0002 are unchanged from v2, except for updated commit 
> messages.
> I believe these patches are still useful, as they allow the operation to 
> complete
> successfully after waiting, or to behave appropriately when the target 
> function
> is dropped by another session during the wait.

I found that the error "tuple concurrently updated" was expected as the results
of injection_points test , so I've fixed it so that the new message is expected
instead.

I've attached updated patches.

Best regards,
Yugo Nagata

-- 
Yugo Nagata <nag...@sraoss.co.jp>
>From b7ba94bb6ae4e646114b559b31f52e4c02958b6c Mon Sep 17 00:00:00 2001
From: Yugo Nagata <nag...@sraoss.co.jp>
Date: Thu, 5 Jun 2025 15:32:37 +0900
Subject: [PATCH v4 3/3] Improve error reporting for concurrent updates on
 system catalog tuples

Previously, when multiple sessions attempted to modify the same system
catalog tuple concurrently due to insufficient locking, DDL commands could
fail with an internal error:

ERROR:  tuple concurrently updated

This commit improves the behavior by reporting a more appropriate and
user-facing error message in such cases, making it easier for users to
understand the cause of the failure.
---
 src/backend/access/heap/heapam.c              | 34 ++++++++++++++++---
 .../expected/syscache-update-pruned.out       |  2 +-
 2 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 0dcd6ee817e..712b7ded6d2 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -3194,6 +3194,7 @@ simple_heap_delete(Relation relation, ItemPointer tid)
 {
 	TM_Result	result;
 	TM_FailureData tmfd;
+	bool is_catalog = IsCatalogRelation(relation);
 
 	result = heap_delete(relation, tid,
 						 GetCurrentCommandId(true), InvalidSnapshot,
@@ -3211,11 +3212,23 @@ simple_heap_delete(Relation relation, ItemPointer tid)
 			break;
 
 		case TM_Updated:
-			elog(ERROR, "tuple concurrently updated");
+			if (is_catalog)
+				ereport(ERROR,
+						(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+						 errmsg("operation failed due to a concurrent command"),
+						 errdetail("Another command modified the same object in a concurrent session.")));
+			else
+				elog(ERROR, "tuple concurrently updated");
 			break;
 
 		case TM_Deleted:
-			elog(ERROR, "tuple concurrently deleted");
+			if (is_catalog)
+				ereport(ERROR,
+						(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+						 errmsg("operation failed due to a concurrent command"),
+						 errdetail("Another command dropped the same object in a concurrent session.")));
+			else
+				elog(ERROR, "tuple concurrently deleted");
 			break;
 
 		default:
@@ -4486,6 +4499,7 @@ simple_heap_update(Relation relation, ItemPointer otid, HeapTuple tup,
 	TM_Result	result;
 	TM_FailureData tmfd;
 	LockTupleMode lockmode;
+	bool is_catalog = IsCatalogRelation(relation);
 
 	result = heap_update(relation, otid, tup,
 						 GetCurrentCommandId(true), InvalidSnapshot,
@@ -4503,11 +4517,23 @@ simple_heap_update(Relation relation, ItemPointer otid, HeapTuple tup,
 			break;
 
 		case TM_Updated:
-			elog(ERROR, "tuple concurrently updated");
+			if (is_catalog)
+				ereport(ERROR,
+						(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+						 errmsg("operation failed due to a concurrent command"),
+						 errdetail("Another command modified the same object in a concurrent session.")));
+			else
+				elog(ERROR, "tuple concurrently updated");
 			break;
 
 		case TM_Deleted:
-			elog(ERROR, "tuple concurrently deleted");
+			if (is_catalog)
+				ereport(ERROR,
+						(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+						 errmsg("operation failed due to a concurrent command"),
+						 errdetail("Another command dropped the same object in a concurrent session.")));
+			else
+				elog(ERROR, "tuple concurrently deleted");
 			break;
 
 		default:
diff --git a/src/test/modules/injection_points/expected/syscache-update-pruned.out b/src/test/modules/injection_points/expected/syscache-update-pruned.out
index a6a4e8db996..231545a6cbb 100644
--- a/src/test/modules/injection_points/expected/syscache-update-pruned.out
+++ b/src/test/modules/injection_points/expected/syscache-update-pruned.out
@@ -20,7 +20,7 @@ step wakegrant4:
 	SELECT FROM injection_points_wakeup('heap_update-before-pin');
  <waiting ...>
 step grant1: <... completed>
-ERROR:  tuple concurrently deleted
+ERROR:  operation failed due to a concurrent command
 step wakegrant4: <... completed>
 
 starting permutation: cachefill1 at2 waitprunable4 vac4 grant1 wakeinval4 mkrels4 wakegrant4
-- 
2.43.0

>From 56070e0ab5a73cc090f79850af40b418ddf53645 Mon Sep 17 00:00:00 2001
From: Yugo Nagata <nag...@sraoss.co.jp>
Date: Mon, 31 Mar 2025 20:17:07 +0900
Subject: [PATCH v4 2/3] Prevent internal error caused by concurrent ALTER
 FUNCTION

Previously, concurrent ALTER FUNCTION commands could fail with an
internal error "tuple concurrently updated". This occurred
because multiple sessions attempted to modify the same catalog tuple
simultaneously.

To prevent this, ensure that an exclusive lock on the function object
is acquired earlier in the process.

Additionally, if the target function is dropped by another session while
waiting for the lock, an appropriate error is raised to indicate the
object no longer exists.
---
 src/backend/commands/functioncmds.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index 0335e982b31..5bb03153c0d 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -61,6 +61,7 @@
 #include "parser/parse_func.h"
 #include "parser/parse_type.h"
 #include "pgstat.h"
+#include "storage/lmgr.h"
 #include "tcop/pquery.h"
 #include "tcop/utility.h"
 #include "utils/acl.h"
@@ -1383,9 +1384,21 @@ AlterFunction(ParseState *pstate, AlterFunctionStmt *stmt)
 
 	ObjectAddressSet(address, ProcedureRelationId, funcOid);
 
+	/* Lock the function so nobody else can do anything with it. */
+	LockDatabaseObject(ProcedureRelationId, funcOid, 0, AccessExclusiveLock);
+
+	/*
+	 * It is possible that by the time we acquire the lock on function,
+	 * concurrent DDL has removed it. We can test this by checking the
+	 * existence of function. We get the tuple again to avoid the risk
+	 * of function definition getting changed.
+	 */
 	tup = SearchSysCacheCopy1(PROCOID, ObjectIdGetDatum(funcOid));
-	if (!HeapTupleIsValid(tup)) /* should not happen */
-		elog(ERROR, "cache lookup failed for function %u", funcOid);
+	if (!HeapTupleIsValid(tup))
+		ereport(ERROR,
+				errcode(ERRCODE_UNDEFINED_OBJECT),
+				errmsg("function \"%s\" does not exist",
+					   NameListToString(stmt->func->objname)));
 
 	procForm = (Form_pg_proc) GETSTRUCT(tup);
 
-- 
2.43.0

>From 6c11eca73b0b5b847edf5e1eb5840ee692b1e5fe Mon Sep 17 00:00:00 2001
From: Yugo Nagata <nag...@sraoss.co.jp>
Date: Mon, 31 Mar 2025 18:46:58 +0900
Subject: [PATCH v4 1/3] Prevent internal error at concurrent CREATE OR REPLACE
 FUNCTION

Previously, concurrent CREATE OR REPLACE FUNCTION commands could fail
with an internal error "tuple concurrently updated". This occurred
because multiple sessions attempted to modify the same catalog tuple
simultaneously.

To prevent this, ensure that an exclusive lock on the function object
is acquired earlier in the process.

Additionally, if the target function is dropped by another session while
waiting for the lock, a new function is created instead.
---
 src/backend/catalog/pg_proc.c | 38 +++++++++++++++++++++++++++--------
 1 file changed, 30 insertions(+), 8 deletions(-)

diff --git a/src/backend/catalog/pg_proc.c b/src/backend/catalog/pg_proc.c
index 5fdcf24d5f8..a35696480c7 100644
--- a/src/backend/catalog/pg_proc.c
+++ b/src/backend/catalog/pg_proc.c
@@ -34,6 +34,7 @@
 #include "parser/parse_coerce.h"
 #include "pgstat.h"
 #include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
 #include "tcop/pquery.h"
 #include "tcop/tcopprot.h"
 #include "utils/acl.h"
@@ -383,24 +384,45 @@ ProcedureCreate(const char *procedureName,
 	tupDesc = RelationGetDescr(rel);
 
 	/* Check for pre-existing definition */
-	oldtup = SearchSysCache3(PROCNAMEARGSNSP,
-							 PointerGetDatum(procedureName),
-							 PointerGetDatum(parameterTypes),
-							 ObjectIdGetDatum(procNamespace));
+	oldtup = SearchSysCacheCopy3(PROCNAMEARGSNSP,
+								 PointerGetDatum(procedureName),
+								 PointerGetDatum(parameterTypes),
+								 ObjectIdGetDatum(procNamespace));
 
 	if (HeapTupleIsValid(oldtup))
 	{
 		/* There is one; okay to replace it? */
 		Form_pg_proc oldproc = (Form_pg_proc) GETSTRUCT(oldtup);
-		Datum		proargnames;
-		bool		isnull;
-		const char *dropcmd;
 
 		if (!replace)
 			ereport(ERROR,
 					(errcode(ERRCODE_DUPLICATE_FUNCTION),
 					 errmsg("function \"%s\" already exists with same argument types",
 							procedureName)));
+
+		/* Lock the function so nobody else can do anything with it. */
+		LockDatabaseObject(ProcedureRelationId, oldproc->oid, 0, AccessExclusiveLock);
+
+		/*
+		 * It is possible that by the time we acquire the lock on function,
+		 * concurrent DDL has removed it. We can test this by checking the
+		 * existence of function. We get the tuple again to avoid the risk
+		 * of function definition getting changed.
+		 */
+		oldtup = SearchSysCacheCopy3(PROCNAMEARGSNSP,
+									 PointerGetDatum(procedureName),
+									 PointerGetDatum(parameterTypes),
+									 ObjectIdGetDatum(procNamespace));
+	}
+
+	if (HeapTupleIsValid(oldtup))
+	{
+
+		Form_pg_proc oldproc = (Form_pg_proc) GETSTRUCT(oldtup);
+		Datum           proargnames;
+		bool            isnull;
+		const char *dropcmd;
+
 		if (!object_ownercheck(ProcedureRelationId, oldproc->oid, proowner))
 			aclcheck_error(ACLCHECK_NOT_OWNER, OBJECT_FUNCTION,
 						   procedureName);
@@ -585,7 +607,7 @@ ProcedureCreate(const char *procedureName,
 		tup = heap_modify_tuple(oldtup, tupDesc, values, nulls, replaces);
 		CatalogTupleUpdate(rel, &tup->t_self, tup);
 
-		ReleaseSysCache(oldtup);
+		heap_freetuple(oldtup);
 		is_update = true;
 	}
 	else
-- 
2.43.0

Reply via email to