On Mon, May 06, 2024 at 02:23:24PM -0700, Noah Misch wrote:
> Here's how I've patched it locally.  It does avoid changing the backend-side,
> which has some attraction.  Shall I just push this?

It looks like you did not rebase on top of HEAD to avoid the spinlock
taken with InjectionPointDetach() in the shmem callback.  I think that
you'd mean the attached, once rebased (apologies I've reset the author
field).

+ *    s1: local-attach to POINT
+ *    s2: yield CPU before InjectionPointRun(POINT) calls injection_callback()
+ *    s1: exit()
+ *    s2: run POINT as though it had been non-local

I see.  So you are taking a shortcut in the shape of never resetting
the name of a condition, so as it is possible to let the point of step
4 check the runtime condition with a condition still stored while the
point has been detached, removed from the hash table.

         if (strcmp(condition->name, name) == 0)
         {
+            condition->valid = false;
             condition->pid = 0;
-            condition->name[0] = '\0';
         }
     }

As of HEAD, we rely on InjectionPointCondition->name to be set to
check if a condition is valid.  Your patch adds a different variable
to do mostly the same thing, and never clears the name in any existing
conditions.  A side effect is that this causes the conditions to pile
up on a running server when running installcheck, and assuming that
many test suites are run on a server left running this could cause
spurious failures when failing to find a new slot.  Always resetting
condition->name when detaching a point is a simpler flow and saner
IMO.

Overall, this switches from one detach behavior to a different one,
which may or may not be intuitive depending on what one is looking
for.  FWIW, I see InjectionPointCondition as something that should be
around as long as its injection point exists, with the condition
entirely gone once the point is detached because it should not exist
anymore on the server running, with no information left in shmem.

Through your patch, you make conditions have a different meaning, with
a mix of "local" definition, but it is kind of permanent as it keeps a
trace of the point's name in shmem.  I find the behavior of the patch
less intuitive.  Perhaps it would be interesting to see first the bug
and/or problem you are trying to tackle with this different behavior
as I feel like we could do something even with the module as-is.  As
far as I understand, the implementation of the module on HEAD allows
one to emulate a breakpoint with a wait/wake, which can avoid the
window mentioned in step 2.  Even if a wait point is detached
concurrently, it can be awaken with its traces in shmem removed.
--
Michael
From 9cf38b2f6dadd5b4bb81fe5ccea350d259e6a241 Mon Sep 17 00:00:00 2001
From: Michael Paquier <mich...@paquier.xyz>
Date: Tue, 7 May 2024 10:15:28 +0900
Subject: [PATCH v2] Fix race condition in backend exit after
 injection_points_set_local().

Symptoms were like those prompting commit
f587338dec87d3c35b025e131c5977930ac69077.  That is, under installcheck,
backends from unrelated tests could run the injection points.

Reviewed by FIXME.

Discussion: https://postgr.es/m/FIXME
---
 .../injection_points/injection_points.c       | 27 ++++++++++++++-----
 1 file changed, 20 insertions(+), 7 deletions(-)

diff --git a/src/test/modules/injection_points/injection_points.c b/src/test/modules/injection_points/injection_points.c
index a74a4a28af..8c9a3ebd74 100644
--- a/src/test/modules/injection_points/injection_points.c
+++ b/src/test/modules/injection_points/injection_points.c
@@ -37,7 +37,13 @@ PG_MODULE_MAGIC;
 
 /*
  * Conditions related to injection points.  This tracks in shared memory the
- * runtime conditions under which an injection point is allowed to run.
+ * runtime conditions under which an injection point is allowed to run.  Once
+ * set, a name is never changed.  That avoids a race condition:
+ *
+ *	s1: local-attach to POINT
+ *	s2: yield CPU before InjectionPointRun(POINT) calls injection_callback()
+ *	s1: exit()
+ *	s2: run POINT as though it had been non-local
  *
  * If more types of runtime conditions need to be tracked, this structure
  * should be expanded.
@@ -49,6 +55,9 @@ typedef struct InjectionPointCondition
 
 	/* ID of the process where the injection point is allowed to run */
 	int			pid;
+
+	/* Should "pid" run this injection point, or not? */
+	bool		valid;
 } InjectionPointCondition;
 
 /* Shared state information for injection points. */
@@ -133,7 +142,7 @@ injection_point_allowed(const char *name)
 	{
 		InjectionPointCondition *condition = &inj_state->conditions[i];
 
-		if (strcmp(condition->name, name) == 0)
+		if (condition->valid && strcmp(condition->name, name) == 0)
 		{
 			/*
 			 * Check if this injection point is allowed to run in this
@@ -175,9 +184,11 @@ injection_points_cleanup(int code, Datum arg)
 	{
 		InjectionPointCondition *condition = &inj_state->conditions[i];
 
-		if (condition->name[0] == '\0')
+		if (!condition->valid)
 			continue;
 
+		Assert(condition->name[0] != '\0');
+
 		if (condition->pid != MyProcPid)
 			continue;
 
@@ -197,14 +208,14 @@ injection_points_cleanup(int code, Datum arg)
 	{
 		InjectionPointCondition *condition = &inj_state->conditions[i];
 
-		if (condition->name[0] == '\0')
+		if (condition->valid)
 			continue;
 
 		if (condition->pid != MyProcPid)
 			continue;
 
-		condition->name[0] = '\0';
 		condition->pid = 0;
+		condition->valid = false;
 	}
 	SpinLockRelease(&inj_state->lock);
 }
@@ -326,11 +337,13 @@ injection_points_attach(PG_FUNCTION_ARGS)
 		{
 			InjectionPointCondition *condition = &inj_state->conditions[i];
 
-			if (condition->name[0] == '\0')
+			if (strcmp(condition->name, name) == 0 ||
+				condition->name[0] == '\0')
 			{
 				index = i;
 				strlcpy(condition->name, name, INJ_NAME_MAXLEN);
 				condition->pid = MyProcPid;
+				condition->valid = true;
 				break;
 			}
 		}
@@ -444,7 +457,7 @@ injection_points_detach(PG_FUNCTION_ARGS)
 		if (strcmp(condition->name, name) == 0)
 		{
 			condition->pid = 0;
-			condition->name[0] = '\0';
+			condition->valid = false;
 		}
 	}
 	SpinLockRelease(&inj_state->lock);
-- 
2.43.0

Attachment: signature.asc
Description: PGP signature

Reply via email to