On Wed, 2 Jul 2025 at 13:21, Zhijie Hou (Fujitsu)
<houzj.f...@fujitsu.com> wrote:
>
> On Wed, Jul 2, 2025 at 2:42 PM vignesh C wrote:
>
> >
> > Hi,
> >
> > I encountered an invalid pointer access issue. Below are the steps to
> > reproduce the issue:
> ...
> > The error occurs because entry->columns is allocated in the entry
> > private context (entry->entry_cxt) by pub_collist_to_bitmapset(). This
> > context is a child of the PortalContext, which is cleared after an
> > error via: AbortTransaction
> > -> AtAbort_Portals  ->
> > MemoryContextDeleteChildren  -> MemoryContextDelete   ->
> > MemoryContextDeleteOnly
> > As a result, the memory backing entry->columns is freed, but the
> > RelationSyncCache which resides in CacheMemoryContext and thus
> > survives the error still holds a dangling pointer to this freed
> > memory, causing it to pfree an invalid pointer.
> > In the normal (positive) execution flow, pgoutput_shutdown() is called
> > to clean up the RelationSyncCache. This happens via:
> > FreeDecodingContext -> shutdown_cb_wrapper -> pgoutput_shutdown But
> > this is not called in case of an error case. To handle this case
> > safely, I suggest calling FreeDecodingContext in the PG_CATCH block to
> > ensure pgoutput_shutdown is invoked and the stale cache is cleared 
> > appropriately.
> > Attached patch has the changes for the same.
> > Thoughts?
>
> Thank you for reporting the issue and providing a fix.
>
> I recall that we identified this general issue with the hash table in pgoutput
> in other threads as well [1]. The basic consensus [2] is that calling
> FreeDecodingContext() within PG_CATCH is not ideal, as this function includes
> user code, increasing the risk of encountering another error within PG_CATCH.
> This scenario could prevent execution of subsequent code to invalidate 
> syscache
> entries, which is problematic.

Yes, let's avoid this.

> I think a better fix could be to introduce a memory context reset callback(on
> data->cachectx) and perform the actions of pgoutput_shutdown() within it.

The attached v2 version patch has the changes for the same.

Regards,
Vignesh
From dea5cc35c7ad2a86e86c28096b50f132d51bca57 Mon Sep 17 00:00:00 2001
From: Vignesh C <vignes...@gmail.com>
Date: Thu, 3 Jul 2025 19:48:28 +0530
Subject: [PATCH v2] Fix referencing invalid pointer in logical decoding after
 error

When an error occurs while processing changes with conflicting
column lists in different publications, the entry->columns memory
which is allocated in entry private context is freed. However, the
RelationSyncCache still holds a pointer to this memory, leading to
a crash on subsequent access. Fix this by clearing the cached relation
info through MemoryContext reset callback.
---
 src/backend/replication/pgoutput/pgoutput.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index 082b4d9d327..1d00587b715 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -424,6 +424,19 @@ parse_output_parameters(List *options, PGOutputData *data)
 				errmsg("option \"%s\" missing", "publication_names"));
 }
 
+/*
+ * Memory context reset callback for clearing the cached relation info.
+ */
+static void
+cache_context_callback(void *arg)
+{
+	if (RelationSyncCache)
+	{
+		hash_destroy(RelationSyncCache);
+		RelationSyncCache = NULL;
+	}
+}
+
 /*
  * Initialize this plugin
  */
@@ -433,6 +446,7 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
 {
 	PGOutputData *data = palloc0(sizeof(PGOutputData));
 	static bool publication_callback_registered = false;
+	MemoryContextCallback *mcallback;
 
 	/* Create our memory context for private allocations. */
 	data->context = AllocSetContextCreate(ctx->context,
@@ -447,6 +461,10 @@ pgoutput_startup(LogicalDecodingContext *ctx, OutputPluginOptions *opt,
 										 "logical replication publication list context",
 										 ALLOCSET_SMALL_SIZES);
 
+	mcallback = palloc0(sizeof(MemoryContextCallback));
+	mcallback->func = cache_context_callback;
+	MemoryContextRegisterResetCallback(data->cachectx, mcallback);
+
 	ctx->output_plugin_private = data;
 
 	/* This plugin uses binary protocol. */
-- 
2.43.0

Reply via email to