Anthonin Bonnefoy <[email protected]> writes:
> I've updated the patch with another approach:
I got back to looking at this today. I still don't like very much
about it. After thinking for awhile, I concur that we want to create
the cmd_context under TopMemoryContext, but I think we can make things
a great deal simpler and less fragile by just keeping it in existence
indefinitely, as attached. The fundamental problem here is that
exec_replication_command is creating and deleting the context without
regard to where transaction boundaries are. I don't think the best
way to deal with that is to require it to know where those boundaries
are.
I also don't like the proposed test script. The test case alone would
be fine, but spinning up an entire new instance seems a rather huge
overhead to carry forevermore for a single test that likely will never
find anything again. I tried to cram the test case into one of the
existing scripts, but it was hard to find one where it would fit
naturally. An obvious candidate is subscription/t/100_bugs.pl, but
the test failed there for lack of access to the test_decoding plugin.
Perhaps we could use another plugin, but I didn't try hard.
As for the memory-context-bug-catching changes, I'm not very convinced
by the idea of setting context->type = T_Invalid. That would help
if someone tries to reference the context while it's still in the
freelist, but not if the reference is from re-use of a stale pointer
after the context has been reassigned (which IIUC is the case here).
I also think that decorating a few MemoryContextSwitchTos with
assertions looks pretty random. I wonder if we should content
ourselves with adding some assertions that catch attempts to make
a context be its own parent. It looks like MemoryContextSetParent
already does that, so the only new assert would be in
MemoryContextCreate.
regards, tom lane
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 987a7336ec8..9fa8beb6103 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -1973,8 +1973,10 @@ exec_replication_command(const char *cmd_string)
int parse_rc;
Node *cmd_node;
const char *cmdtag;
- MemoryContext cmd_context;
- MemoryContext old_context;
+ MemoryContext old_context = CurrentMemoryContext;
+
+ /* We save and re-use the cmd_context across calls */
+ static MemoryContext cmd_context = NULL;
/*
* If WAL sender has been told that shutdown is getting close, switch its
@@ -2003,11 +2005,30 @@ exec_replication_command(const char *cmd_string)
/*
* Prepare to parse and execute the command.
+ *
+ * Because replication command execution can involve beginning or ending
+ * transactions, we need a working context that will survive that, so we
+ * make it a child of TopMemoryContext. That in turn creates a hazard of
+ * long-lived memory leaks if we lose track of the working context. We
+ * deal with that by creating it only once per walsender, and resetting it
+ * for each new command. (Normally this reset is a no-op, but if the
+ * prior exec_replication_command call failed with an error, it won't be.)
+ *
+ * This is subtler than it looks. The transactions we manage can extend
+ * across replication commands, indeed SnapBuildClearExportedSnapshot
+ * might have just ended one. Because transaction exit will revert to the
+ * memory context that was current at transaction start, we need to be
+ * sure that that context is still valid. That motivates re-using the
+ * same cmd_context rather than making a new one each time.
*/
- cmd_context = AllocSetContextCreate(CurrentMemoryContext,
- "Replication command context",
- ALLOCSET_DEFAULT_SIZES);
- old_context = MemoryContextSwitchTo(cmd_context);
+ if (cmd_context == NULL)
+ cmd_context = AllocSetContextCreate(TopMemoryContext,
+ "Replication command context",
+ ALLOCSET_DEFAULT_SIZES);
+ else
+ MemoryContextReset(cmd_context);
+
+ MemoryContextSwitchTo(cmd_context);
replication_scanner_init(cmd_string, &scanner);
@@ -2020,7 +2041,7 @@ exec_replication_command(const char *cmd_string)
replication_scanner_finish(scanner);
MemoryContextSwitchTo(old_context);
- MemoryContextDelete(cmd_context);
+ MemoryContextReset(cmd_context);
/* XXX this is a pretty random place to make this check */
if (MyDatabaseId == InvalidOid)
@@ -2180,9 +2201,12 @@ exec_replication_command(const char *cmd_string)
cmd_node->type);
}
- /* done */
+ /*
+ * Done. Revert to caller's memory context, and clean out the cmd_context
+ * to recover memory right away.
+ */
MemoryContextSwitchTo(old_context);
- MemoryContextDelete(cmd_context);
+ MemoryContextReset(cmd_context);
/*
* We need not update ps display or pg_stat_activity, because PostgresMain