On Mon, May 06, 2024 at 02:23:24PM -0700, Noah Misch wrote: > Here's how I've patched it locally. It does avoid changing the backend-side, > which has some attraction. Shall I just push this?
It looks like you did not rebase on top of HEAD to avoid the spinlock taken with InjectionPointDetach() in the shmem callback. I think that you'd mean the attached, once rebased (apologies I've reset the author field). + * s1: local-attach to POINT + * s2: yield CPU before InjectionPointRun(POINT) calls injection_callback() + * s1: exit() + * s2: run POINT as though it had been non-local I see. So you are taking a shortcut in the shape of never resetting the name of a condition, so as it is possible to let the point of step 4 check the runtime condition with a condition still stored while the point has been detached, removed from the hash table. if (strcmp(condition->name, name) == 0) { + condition->valid = false; condition->pid = 0; - condition->name[0] = '\0'; } } As of HEAD, we rely on InjectionPointCondition->name to be set to check if a condition is valid. Your patch adds a different variable to do mostly the same thing, and never clears the name in any existing conditions. A side effect is that this causes the conditions to pile up on a running server when running installcheck, and assuming that many test suites are run on a server left running this could cause spurious failures when failing to find a new slot. Always resetting condition->name when detaching a point is a simpler flow and saner IMO. Overall, this switches from one detach behavior to a different one, which may or may not be intuitive depending on what one is looking for. FWIW, I see InjectionPointCondition as something that should be around as long as its injection point exists, with the condition entirely gone once the point is detached because it should not exist anymore on the server running, with no information left in shmem. Through your patch, you make conditions have a different meaning, with a mix of "local" definition, but it is kind of permanent as it keeps a trace of the point's name in shmem. I find the behavior of the patch less intuitive. Perhaps it would be interesting to see first the bug and/or problem you are trying to tackle with this different behavior as I feel like we could do something even with the module as-is. As far as I understand, the implementation of the module on HEAD allows one to emulate a breakpoint with a wait/wake, which can avoid the window mentioned in step 2. Even if a wait point is detached concurrently, it can be awaken with its traces in shmem removed. -- Michael
From 9cf38b2f6dadd5b4bb81fe5ccea350d259e6a241 Mon Sep 17 00:00:00 2001 From: Michael Paquier <mich...@paquier.xyz> Date: Tue, 7 May 2024 10:15:28 +0900 Subject: [PATCH v2] Fix race condition in backend exit after injection_points_set_local(). Symptoms were like those prompting commit f587338dec87d3c35b025e131c5977930ac69077. That is, under installcheck, backends from unrelated tests could run the injection points. Reviewed by FIXME. Discussion: https://postgr.es/m/FIXME --- .../injection_points/injection_points.c | 27 ++++++++++++++----- 1 file changed, 20 insertions(+), 7 deletions(-) diff --git a/src/test/modules/injection_points/injection_points.c b/src/test/modules/injection_points/injection_points.c index a74a4a28af..8c9a3ebd74 100644 --- a/src/test/modules/injection_points/injection_points.c +++ b/src/test/modules/injection_points/injection_points.c @@ -37,7 +37,13 @@ PG_MODULE_MAGIC; /* * Conditions related to injection points. This tracks in shared memory the - * runtime conditions under which an injection point is allowed to run. + * runtime conditions under which an injection point is allowed to run. Once + * set, a name is never changed. That avoids a race condition: + * + * s1: local-attach to POINT + * s2: yield CPU before InjectionPointRun(POINT) calls injection_callback() + * s1: exit() + * s2: run POINT as though it had been non-local * * If more types of runtime conditions need to be tracked, this structure * should be expanded. @@ -49,6 +55,9 @@ typedef struct InjectionPointCondition /* ID of the process where the injection point is allowed to run */ int pid; + + /* Should "pid" run this injection point, or not? */ + bool valid; } InjectionPointCondition; /* Shared state information for injection points. */ @@ -133,7 +142,7 @@ injection_point_allowed(const char *name) { InjectionPointCondition *condition = &inj_state->conditions[i]; - if (strcmp(condition->name, name) == 0) + if (condition->valid && strcmp(condition->name, name) == 0) { /* * Check if this injection point is allowed to run in this @@ -175,9 +184,11 @@ injection_points_cleanup(int code, Datum arg) { InjectionPointCondition *condition = &inj_state->conditions[i]; - if (condition->name[0] == '\0') + if (!condition->valid) continue; + Assert(condition->name[0] != '\0'); + if (condition->pid != MyProcPid) continue; @@ -197,14 +208,14 @@ injection_points_cleanup(int code, Datum arg) { InjectionPointCondition *condition = &inj_state->conditions[i]; - if (condition->name[0] == '\0') + if (condition->valid) continue; if (condition->pid != MyProcPid) continue; - condition->name[0] = '\0'; condition->pid = 0; + condition->valid = false; } SpinLockRelease(&inj_state->lock); } @@ -326,11 +337,13 @@ injection_points_attach(PG_FUNCTION_ARGS) { InjectionPointCondition *condition = &inj_state->conditions[i]; - if (condition->name[0] == '\0') + if (strcmp(condition->name, name) == 0 || + condition->name[0] == '\0') { index = i; strlcpy(condition->name, name, INJ_NAME_MAXLEN); condition->pid = MyProcPid; + condition->valid = true; break; } } @@ -444,7 +457,7 @@ injection_points_detach(PG_FUNCTION_ARGS) if (strcmp(condition->name, name) == 0) { condition->pid = 0; - condition->name[0] = '\0'; + condition->valid = false; } } SpinLockRelease(&inj_state->lock); -- 2.43.0
signature.asc
Description: PGP signature