I wrote:
> And, since there's nothing new under the sun around here,
> we already had a discussion about that back in 2021:
> https://www.postgresql.org/message-id/flat/3471359.1615937770%40sss.pgh.pa.us
> That thread seems to have led to fixing some specific bugs,
> but we never committed any of the discussed valgrind infrastructure
> improvements.  I'll have a go at resurrecting that...

Okay, here is a patch series that updates the
0001-Make-memory-contexts-themselves-more-visible-to-valg.patch
patch you posted in that thread, and adds various follow-up
patches that either fix or paper over assorted leaks.  Some of it
is committable I think, but other parts are just WIP.  Anyway,
as of the 0010 patch we can run through the core regression tests
and see no more than a couple of kilobytes total reported leakage
in any process, except for two tests that expose leaks in TS
dictionary building.  (That's fixable but I ran out of time,
and I wanted to get this posted before Montreal.)  There is
work left to do before we can remove the suppressions added in
0002, but this is already huge progress compared to where we were.

A couple of these patches are bug fixes that need to be applied and
even back-patched.  In particular, I had not realized that autovacuum
leaks a nontrivial amount of memory per relation processed (cf 0009),
and apparently has done for a few releases now.  This is horrid in
databases with many tables, and I'm surprised that we've not gotten
complaints about it.

                        regards, tom lane

From d4ca3e2e4824ff1f11ee5324b4ba46c4dd971ce9 Mon Sep 17 00:00:00 2001
From: Tom Lane <t...@sss.pgh.pa.us>
Date: Sun, 11 May 2025 12:39:59 -0400
Subject: [PATCH v1 01/11] Improve our support for Valgrind's leak tracking.

When determining whether an allocated chunk is still reachable,
Valgrind will consider only pointers within what it believes to be
allocated chunks.  Normally, all of a block obtained from malloc()
would be considered "allocated" --- but it turns out that if we use
VALGRIND_MEMPOOL_ALLOC to designate sub-section(s) of a malloc'ed
block as allocated, all the rest of that malloc'ed block is ignored.
This leads to lots of false positives of course.  In particular,
in any multi-malloc-block context, all but the primary block were
reported as leaked.  We also had a problem with context "ident"
strings, which were reported as leaked unless there was some other
pointer to them besides the one in the context header.

To fix, we need to use VALGRIND_MEMPOOL_ALLOC to designate
a context's management structs (the context struct itself and
any per-block headers) as allocated chunks.  That forces moving
the VALGRIND_CREATE_MEMPOOL/VALGRIND_DESTROY_MEMPOOL calls into
the per-context-type code, so that the pool identifier can be
made as soon as we've allocated the initial block, but otherwise
it's fairly straightforward.  Note that in Valgrind's eyes there
is no distinction between these allocations and the allocations
that the mmgr modules hand out to user code.  That's fine for
now, but perhaps someday we'll want to do better yet.

Author: Andres Freund <and...@anarazel.de>
Co-authored-by: Tom Lane <t...@sss.pgh.pa.us>
Discussion: https://postgr.es/m/285483.1746756...@sss.pgh.pa.us
Discussion: https://postgr.es/m/20210317181531.7oggpqevzz6bk...@alap3.anarazel.de
---
 src/backend/utils/mmgr/aset.c       | 69 ++++++++++++++++++++++++++++-
 src/backend/utils/mmgr/bump.c       | 31 ++++++++++++-
 src/backend/utils/mmgr/generation.c | 29 ++++++++++++
 src/backend/utils/mmgr/mcxt.c       | 15 ++++---
 src/backend/utils/mmgr/slab.c       | 32 +++++++++++++
 src/include/utils/memdebug.h        |  1 +
 6 files changed, 168 insertions(+), 9 deletions(-)

diff --git a/src/backend/utils/mmgr/aset.c b/src/backend/utils/mmgr/aset.c
index 666ecd8f78d..21490213842 100644
--- a/src/backend/utils/mmgr/aset.c
+++ b/src/backend/utils/mmgr/aset.c
@@ -103,6 +103,8 @@
 
 #define ALLOC_BLOCKHDRSZ	MAXALIGN(sizeof(AllocBlockData))
 #define ALLOC_CHUNKHDRSZ	sizeof(MemoryChunk)
+#define FIRST_BLOCKHDRSZ	(MAXALIGN(sizeof(AllocSetContext)) + \
+							 ALLOC_BLOCKHDRSZ)
 
 typedef struct AllocBlockData *AllocBlock;	/* forward reference */
 
@@ -458,6 +460,21 @@ AllocSetContextCreateInternal(MemoryContext parent,
 	 * we'd leak the header/initial block if we ereport in this stretch.
 	 */
 
+	/* Create a Valgrind "pool" associated with the context */
+	VALGRIND_CREATE_MEMPOOL(set, 0, false);
+
+	/*
+	 * Create a Valgrind "chunk" covering both the AllocSetContext struct and
+	 * the keeper block's header.  (It would be more sensible for these to be
+	 * two separate chunks, but doing that seems to tickle bugs in some
+	 * versions of Valgrind.)  We must have these chunks, and also a chunk for
+	 * each subsequently-added block header, so that Valgrind considers the
+	 * pointers within them while checking for leaked memory.  Note that
+	 * Valgrind doesn't distinguish between these chunks and those created by
+	 * mcxt.c for the user-accessible-data chunks we allocate.
+	 */
+	VALGRIND_MEMPOOL_ALLOC(set, set, FIRST_BLOCKHDRSZ);
+
 	/* Fill in the initial block's block header */
 	block = KeeperBlock(set);
 	block->aset = set;
@@ -585,6 +602,14 @@ AllocSetReset(MemoryContext context)
 #ifdef CLOBBER_FREED_MEMORY
 			wipe_mem(block, block->freeptr - ((char *) block));
 #endif
+
+			/*
+			 * We need to free Valgrind's block-header chunks explicitly,
+			 * although the user-data chunks within will go away in the TRIM
+			 * below.  Otherwise Valgrind complains about leaked allocations.
+			 */
+			VALGRIND_MEMPOOL_FREE(set, block);
+
 			free(block);
 		}
 		block = next;
@@ -592,6 +617,14 @@ AllocSetReset(MemoryContext context)
 
 	Assert(context->mem_allocated == keepersize);
 
+	/*
+	 * Instruct Valgrind to throw away all the chunks associated with this
+	 * context, except for the one covering the AllocSetContext and
+	 * keeper-block header.  This gets rid of the chunks for whatever user
+	 * data is getting discarded by the context reset.
+	 */
+	VALGRIND_MEMPOOL_TRIM(set, set, FIRST_BLOCKHDRSZ);
+
 	/* Reset block size allocation sequence, too */
 	set->nextBlockSize = set->initBlockSize;
 }
@@ -648,6 +681,9 @@ AllocSetDelete(MemoryContext context)
 				freelist->first_free = (AllocSetContext *) oldset->header.nextchild;
 				freelist->num_free--;
 
+				/* Destroy the context's Valgrind pool --- see notes below */
+				VALGRIND_DESTROY_MEMPOOL(oldset);
+
 				/* All that remains is to free the header/initial block */
 				free(oldset);
 			}
@@ -675,13 +711,24 @@ AllocSetDelete(MemoryContext context)
 #endif
 
 		if (!IsKeeperBlock(set, block))
+		{
+			/* As in AllocSetReset, free block-header chunks explicitly */
+			VALGRIND_MEMPOOL_FREE(set, block);
 			free(block);
+		}
 
 		block = next;
 	}
 
 	Assert(context->mem_allocated == keepersize);
 
+	/*
+	 * Destroy the Valgrind pool.  We don't seem to need to explicitly free
+	 * the initial block's header chunk, nor any user-data chunks that
+	 * Valgrind still knows about.
+	 */
+	VALGRIND_DESTROY_MEMPOOL(set);
+
 	/* Finally, free the context header, including the keeper block */
 	free(set);
 }
@@ -716,6 +763,9 @@ AllocSetAllocLarge(MemoryContext context, Size size, int flags)
 	if (block == NULL)
 		return MemoryContextAllocationFailure(context, size, flags);
 
+	/* Make a Valgrind chunk covering the new block's header */
+	VALGRIND_MEMPOOL_ALLOC(set, block, ALLOC_BLOCKHDRSZ);
+
 	context->mem_allocated += blksize;
 
 	block->aset = set;
@@ -922,6 +972,9 @@ AllocSetAllocFromNewBlock(MemoryContext context, Size size, int flags,
 	if (block == NULL)
 		return MemoryContextAllocationFailure(context, size, flags);
 
+	/* Make a Valgrind chunk covering the new block's header */
+	VALGRIND_MEMPOOL_ALLOC(set, block, ALLOC_BLOCKHDRSZ);
+
 	context->mem_allocated += blksize;
 
 	block->aset = set;
@@ -1104,6 +1157,10 @@ AllocSetFree(void *pointer)
 #ifdef CLOBBER_FREED_MEMORY
 		wipe_mem(block, block->freeptr - ((char *) block));
 #endif
+
+		/* As in AllocSetReset, free block-header chunks explicitly */
+		VALGRIND_MEMPOOL_FREE(set, block);
+
 		free(block);
 	}
 	else
@@ -1184,6 +1241,7 @@ AllocSetRealloc(void *pointer, Size size, int flags)
 		 * realloc() to make the containing block bigger, or smaller, with
 		 * minimum space wastage.
 		 */
+		AllocBlock	newblock;
 		Size		chksize;
 		Size		blksize;
 		Size		oldblksize;
@@ -1223,14 +1281,21 @@ AllocSetRealloc(void *pointer, Size size, int flags)
 		blksize = chksize + ALLOC_BLOCKHDRSZ + ALLOC_CHUNKHDRSZ;
 		oldblksize = block->endptr - ((char *) block);
 
-		block = (AllocBlock) realloc(block, blksize);
-		if (block == NULL)
+		newblock = (AllocBlock) realloc(block, blksize);
+		if (newblock == NULL)
 		{
 			/* Disallow access to the chunk header. */
 			VALGRIND_MAKE_MEM_NOACCESS(chunk, ALLOC_CHUNKHDRSZ);
 			return MemoryContextAllocationFailure(&set->header, size, flags);
 		}
 
+		/*
+		 * Move Valgrind's block-header chunk explicitly.  (mcxt.c will take
+		 * care of moving the chunk for the user data.)
+		 */
+		VALGRIND_MEMPOOL_CHANGE(set, block, newblock, ALLOC_BLOCKHDRSZ);
+		block = newblock;
+
 		/* updated separately, not to underflow when (oldblksize > blksize) */
 		set->header.mem_allocated -= oldblksize;
 		set->header.mem_allocated += blksize;
diff --git a/src/backend/utils/mmgr/bump.c b/src/backend/utils/mmgr/bump.c
index f7a37d1b3e8..271bf78c583 100644
--- a/src/backend/utils/mmgr/bump.c
+++ b/src/backend/utils/mmgr/bump.c
@@ -45,7 +45,9 @@
 #include "utils/memutils_memorychunk.h"
 #include "utils/memutils_internal.h"
 
-#define Bump_BLOCKHDRSZ	MAXALIGN(sizeof(BumpBlock))
+#define Bump_BLOCKHDRSZ		MAXALIGN(sizeof(BumpBlock))
+#define FIRST_BLOCKHDRSZ	(MAXALIGN(sizeof(BumpContext)) + \
+							 Bump_BLOCKHDRSZ)
 
 /* No chunk header unless built with MEMORY_CONTEXT_CHECKING */
 #ifdef MEMORY_CONTEXT_CHECKING
@@ -189,6 +191,12 @@ BumpContextCreate(MemoryContext parent, const char *name, Size minContextSize,
 	 * Avoid writing code that can fail between here and MemoryContextCreate;
 	 * we'd leak the header and initial block if we ereport in this stretch.
 	 */
+
+	/* See comments about Valgrind interactions in aset.c */
+	VALGRIND_CREATE_MEMPOOL(set, 0, false);
+	/* This chunk covers the BumpContext and the keeper block header */
+	VALGRIND_MEMPOOL_ALLOC(set, set, FIRST_BLOCKHDRSZ);
+
 	dlist_init(&set->blocks);
 
 	/* Fill in the initial block's block header */
@@ -262,6 +270,14 @@ BumpReset(MemoryContext context)
 			BumpBlockFree(set, block);
 	}
 
+	/*
+	 * Instruct Valgrind to throw away all the chunks associated with this
+	 * context, except for the one covering the BumpContext and keeper-block
+	 * header.  This gets rid of the chunks for whatever user data is getting
+	 * discarded by the context reset.
+	 */
+	VALGRIND_MEMPOOL_TRIM(set, set, FIRST_BLOCKHDRSZ);
+
 	/* Reset block size allocation sequence, too */
 	set->nextBlockSize = set->initBlockSize;
 
@@ -279,6 +295,10 @@ BumpDelete(MemoryContext context)
 {
 	/* Reset to release all releasable BumpBlocks */
 	BumpReset(context);
+
+	/* Destroy the Valgrind pool -- see notes in aset.c */
+	VALGRIND_DESTROY_MEMPOOL(context);
+
 	/* And free the context header and keeper block */
 	free(context);
 }
@@ -318,6 +338,9 @@ BumpAllocLarge(MemoryContext context, Size size, int flags)
 	if (block == NULL)
 		return MemoryContextAllocationFailure(context, size, flags);
 
+	/* Make a Valgrind chunk covering the new block's header */
+	VALGRIND_MEMPOOL_ALLOC(set, block, Bump_BLOCKHDRSZ);
+
 	context->mem_allocated += blksize;
 
 	/* the block is completely full */
@@ -455,6 +478,9 @@ BumpAllocFromNewBlock(MemoryContext context, Size size, int flags,
 	if (block == NULL)
 		return MemoryContextAllocationFailure(context, size, flags);
 
+	/* Make a Valgrind chunk covering the new block's header */
+	VALGRIND_MEMPOOL_ALLOC(set, block, Bump_BLOCKHDRSZ);
+
 	context->mem_allocated += blksize;
 
 	/* initialize the new block */
@@ -606,6 +632,9 @@ BumpBlockFree(BumpContext *set, BumpBlock *block)
 	wipe_mem(block, ((char *) block->endptr - (char *) block));
 #endif
 
+	/* As in aset.c, free block-header chunks explicitly */
+	VALGRIND_MEMPOOL_FREE(set, block);
+
 	free(block);
 }
 
diff --git a/src/backend/utils/mmgr/generation.c b/src/backend/utils/mmgr/generation.c
index 18679ad4f1e..e476df534d7 100644
--- a/src/backend/utils/mmgr/generation.c
+++ b/src/backend/utils/mmgr/generation.c
@@ -45,6 +45,8 @@
 
 #define Generation_BLOCKHDRSZ	MAXALIGN(sizeof(GenerationBlock))
 #define Generation_CHUNKHDRSZ	sizeof(MemoryChunk)
+#define FIRST_BLOCKHDRSZ		(MAXALIGN(sizeof(GenerationContext)) + \
+								 Generation_BLOCKHDRSZ)
 
 #define Generation_CHUNK_FRACTION	8
 
@@ -221,6 +223,12 @@ GenerationContextCreate(MemoryContext parent,
 	 * Avoid writing code that can fail between here and MemoryContextCreate;
 	 * we'd leak the header if we ereport in this stretch.
 	 */
+
+	/* See comments about Valgrind interactions in aset.c */
+	VALGRIND_CREATE_MEMPOOL(set, 0, false);
+	/* This chunk covers the GenerationContext and the keeper block header */
+	VALGRIND_MEMPOOL_ALLOC(set, set, FIRST_BLOCKHDRSZ);
+
 	dlist_init(&set->blocks);
 
 	/* Fill in the initial block's block header */
@@ -309,6 +317,14 @@ GenerationReset(MemoryContext context)
 			GenerationBlockFree(set, block);
 	}
 
+	/*
+	 * Instruct Valgrind to throw away all the chunks associated with this
+	 * context, except for the one covering the GenerationContext and
+	 * keeper-block header.  This gets rid of the chunks for whatever user
+	 * data is getting discarded by the context reset.
+	 */
+	VALGRIND_MEMPOOL_TRIM(set, set, FIRST_BLOCKHDRSZ);
+
 	/* set it so new allocations to make use of the keeper block */
 	set->block = KeeperBlock(set);
 
@@ -329,6 +345,10 @@ GenerationDelete(MemoryContext context)
 {
 	/* Reset to release all releasable GenerationBlocks */
 	GenerationReset(context);
+
+	/* Destroy the Valgrind pool -- see notes in aset.c */
+	VALGRIND_DESTROY_MEMPOOL(context);
+
 	/* And free the context header and keeper block */
 	free(context);
 }
@@ -365,6 +385,9 @@ GenerationAllocLarge(MemoryContext context, Size size, int flags)
 	if (block == NULL)
 		return MemoryContextAllocationFailure(context, size, flags);
 
+	/* Make a Valgrind chunk covering the new block's header */
+	VALGRIND_MEMPOOL_ALLOC(set, block, Generation_BLOCKHDRSZ);
+
 	context->mem_allocated += blksize;
 
 	/* block with a single (used) chunk */
@@ -487,6 +510,9 @@ GenerationAllocFromNewBlock(MemoryContext context, Size size, int flags,
 	if (block == NULL)
 		return MemoryContextAllocationFailure(context, size, flags);
 
+	/* Make a Valgrind chunk covering the new block's header */
+	VALGRIND_MEMPOOL_ALLOC(set, block, Generation_BLOCKHDRSZ);
+
 	context->mem_allocated += blksize;
 
 	/* initialize the new block */
@@ -677,6 +703,9 @@ GenerationBlockFree(GenerationContext *set, GenerationBlock *block)
 	wipe_mem(block, block->blksize);
 #endif
 
+	/* As in aset.c, free block-header chunks explicitly */
+	VALGRIND_MEMPOOL_FREE(set, block);
+
 	free(block);
 }
 
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 7d28ca706eb..6fce916d11e 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -8,6 +8,15 @@
  * context-type-specific operations via the function pointers in a
  * context's MemoryContextMethods struct.
  *
+ * A note about Valgrind support: the context-type-specific code is
+ * responsible for creating and deleting a Valgrind "pool" for each context,
+ * and also for creating Valgrind "chunks" to cover its management data
+ * structures such as block headers.  (There must be a chunk that includes
+ * every pointer we want Valgrind to consider for leak-tracking purposes.)
+ * This module creates and deletes chunks that cover the caller-visible
+ * allocated areas.  However, the context-type-specific code must handle
+ * cleaning up caller-visible chunks during memory context reset operations.
+ *
  *
  * Portions Copyright (c) 1996-2025, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
@@ -449,8 +458,6 @@ MemoryContextResetOnly(MemoryContext context)
 
 		context->methods->reset(context);
 		context->isReset = true;
-		VALGRIND_DESTROY_MEMPOOL(context);
-		VALGRIND_CREATE_MEMPOOL(context, 0, false);
 	}
 }
 
@@ -557,8 +564,6 @@ MemoryContextDeleteOnly(MemoryContext context)
 	context->ident = NULL;
 
 	context->methods->delete_context(context);
-
-	VALGRIND_DESTROY_MEMPOOL(context);
 }
 
 /*
@@ -1212,8 +1217,6 @@ MemoryContextCreate(MemoryContext node,
 		node->nextchild = NULL;
 		node->allowInCritSection = false;
 	}
-
-	VALGRIND_CREATE_MEMPOOL(node, 0, false);
 }
 
 /*
diff --git a/src/backend/utils/mmgr/slab.c b/src/backend/utils/mmgr/slab.c
index d32c0d318fb..563ac728db6 100644
--- a/src/backend/utils/mmgr/slab.c
+++ b/src/backend/utils/mmgr/slab.c
@@ -377,6 +377,11 @@ SlabContextCreate(MemoryContext parent,
 	 * we'd leak the header if we ereport in this stretch.
 	 */
 
+	/* See comments about Valgrind interactions in aset.c */
+	VALGRIND_CREATE_MEMPOOL(slab, 0, false);
+	/* This chunk covers the SlabContext only */
+	VALGRIND_MEMPOOL_ALLOC(slab, slab, sizeof(SlabContext));
+
 	/* Fill in SlabContext-specific header fields */
 	slab->chunkSize = (uint32) chunkSize;
 	slab->fullChunkSize = (uint32) fullChunkSize;
@@ -451,6 +456,10 @@ SlabReset(MemoryContext context)
 #ifdef CLOBBER_FREED_MEMORY
 		wipe_mem(block, slab->blockSize);
 #endif
+
+		/* As in aset.c, free block-header chunks explicitly */
+		VALGRIND_MEMPOOL_FREE(slab, block);
+
 		free(block);
 		context->mem_allocated -= slab->blockSize;
 	}
@@ -467,11 +476,23 @@ SlabReset(MemoryContext context)
 #ifdef CLOBBER_FREED_MEMORY
 			wipe_mem(block, slab->blockSize);
 #endif
+
+			/* As in aset.c, free block-header chunks explicitly */
+			VALGRIND_MEMPOOL_FREE(slab, block);
+
 			free(block);
 			context->mem_allocated -= slab->blockSize;
 		}
 	}
 
+	/*
+	 * Instruct Valgrind to throw away all the chunks associated with this
+	 * context, except for the one covering the SlabContext.  This gets rid of
+	 * the chunks for whatever user data is getting discarded by the context
+	 * reset.
+	 */
+	VALGRIND_MEMPOOL_TRIM(slab, slab, sizeof(SlabContext));
+
 	slab->curBlocklistIndex = 0;
 
 	Assert(context->mem_allocated == 0);
@@ -486,6 +507,10 @@ SlabDelete(MemoryContext context)
 {
 	/* Reset to release all the SlabBlocks */
 	SlabReset(context);
+
+	/* Destroy the Valgrind pool -- see notes in aset.c */
+	VALGRIND_DESTROY_MEMPOOL(context);
+
 	/* And free the context header */
 	free(context);
 }
@@ -567,6 +592,9 @@ SlabAllocFromNewBlock(MemoryContext context, Size size, int flags)
 		if (unlikely(block == NULL))
 			return MemoryContextAllocationFailure(context, size, flags);
 
+		/* Make a Valgrind chunk covering the new block's header */
+		VALGRIND_MEMPOOL_ALLOC(slab, block, Slab_BLOCKHDRSZ);
+
 		block->slab = slab;
 		context->mem_allocated += slab->blockSize;
 
@@ -795,6 +823,10 @@ SlabFree(void *pointer)
 #ifdef CLOBBER_FREED_MEMORY
 			wipe_mem(block, slab->blockSize);
 #endif
+
+			/* As in aset.c, free block-header chunks explicitly */
+			VALGRIND_MEMPOOL_FREE(slab, block);
+
 			free(block);
 			slab->header.mem_allocated -= slab->blockSize;
 		}
diff --git a/src/include/utils/memdebug.h b/src/include/utils/memdebug.h
index 7309271834b..80692dcef93 100644
--- a/src/include/utils/memdebug.h
+++ b/src/include/utils/memdebug.h
@@ -29,6 +29,7 @@
 #define VALGRIND_MEMPOOL_ALLOC(context, addr, size)			do {} while (0)
 #define VALGRIND_MEMPOOL_FREE(context, addr)				do {} while (0)
 #define VALGRIND_MEMPOOL_CHANGE(context, optr, nptr, size)	do {} while (0)
+#define VALGRIND_MEMPOOL_TRIM(context, addr, size)			do {} while (0)
 #endif
 
 
-- 
2.43.5

From fb80149e0517b33dd265ca5b1f559b0f60405a5a Mon Sep 17 00:00:00 2001
From: Tom Lane <t...@sss.pgh.pa.us>
Date: Sun, 11 May 2025 12:52:32 -0400
Subject: [PATCH v1 02/11] Temporarily suppress some Valgrind complaints.

This isn't meant for commit, but it is useful to reduce the large
amount of leak-check noise generated by a couple of known problems:

Temporary buffers, as well as some other structures, are allocated in
a way that causes us not to hold any pointer to what Valgrind thinks
is the start of the chunk.  That causes "possible leak" reports.

PL/pgSQL and SQL-function parsing leak stuff into the long-lived
function cache context.  This isn't really a huge problem, since
the cruft will be recovered if we have to re-parse the function.
But it leads to a lot of complaints from Valgrind.

We need better answers for these things, but for the moment hide
them to allow sorting through other leakage problems.

Author: Tom Lane <t...@sss.pgh.pa.us>
Discussion: https://postgr.es/m/285483.1746756...@sss.pgh.pa.us
---
 src/tools/valgrind.supp | 42 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/src/tools/valgrind.supp b/src/tools/valgrind.supp
index 7ea464c8094..7d4fb0b2c1d 100644
--- a/src/tools/valgrind.supp
+++ b/src/tools/valgrind.supp
@@ -180,3 +180,45 @@
    Memcheck:Cond
    fun:PyObject_Realloc
 }
+
+# Suppress complaints about aligned allocations of temp buffers.
+# This ought to be fixed properly, instead.
+{
+   hide_temp_buffer_allocations
+   Memcheck:Leak
+   match-leak-kinds: possible
+   fun:MemoryContextAlloc
+   fun:GetLocalBufferStorage
+}
+
+# Suppress complaints about aligned allocations of other stuff.
+# This ought to be fixed properly, instead.
+{
+   hide_alignment_leakage
+   Memcheck:Leak
+   match-leak-kinds: possible
+   fun:MemoryContextAllocExtended
+   fun:MemoryContextAllocAligned
+}
+
+# Suppress complaints about stuff leaked by PL/pgSQL parsing.
+# This ought to be fixed properly, instead.
+{
+   hide_plpgsql_parsing_leaks
+   Memcheck:Leak
+   match-leak-kinds: definite,possible,indirect
+
+   ...
+   fun:plpgsql_compile_callback
+}
+
+# And likewise for stuff leaked by SQL-function parsing.
+# This ought to be fixed properly, instead.
+{
+   hide_sql_parsing_leaks
+   Memcheck:Leak
+   match-leak-kinds: definite,possible,indirect
+
+   ...
+   fun:sql_compile_callback
+}
-- 
2.43.5

From 2f7d138a0a222e9f598a6f54c233c2cc7f6391c9 Mon Sep 17 00:00:00 2001
From: Tom Lane <t...@sss.pgh.pa.us>
Date: Sun, 11 May 2025 12:56:23 -0400
Subject: [PATCH v1 03/11] Silence complaints about leaked dynahash storage.

Because dynahash.c never frees hashtable storage except by deleting
the whole hashtable context, it doesn't bother to track the individual
blocks of elements allocated by element_alloc().  This results in
"possibly lost" complaints from Valgrind except when the first element
of each block is actively in use.  (Otherwise it'll be on a freelist,
but very likely only reachable via "interior pointers" within element
blocks, which doesn't satisfy Valgrind.)

To fix, if we're building with USE_VALGRIND, expend an extra pointer's
worth of space in each element block so that we can chain them all
together from the HTAB header.  Skip this in shared hashtables though:
Valgrind doesn't track those, and we'd need additional locking to make
it safe to manipulate a shared chain.

While here, update a comment obsoleted by 9c911ec06.

Author: Tom Lane <t...@sss.pgh.pa.us>
Discussion: https://postgr.es/m/285483.1746756...@sss.pgh.pa.us
---
 src/backend/utils/hash/dynahash.c | 52 +++++++++++++++++++++++++++----
 1 file changed, 46 insertions(+), 6 deletions(-)

diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index 1ad155d446e..f81f9c199e4 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -22,10 +22,11 @@
  * lookup key's hash value as a partition number --- this will work because
  * of the way calc_bucket() maps hash values to bucket numbers.
  *
- * For hash tables in shared memory, the memory allocator function should
- * match malloc's semantics of returning NULL on failure.  For hash tables
- * in local memory, we typically use palloc() which will throw error on
- * failure.  The code in this file has to cope with both cases.
+ * The memory allocator function should match malloc's semantics of returning
+ * NULL on failure.  (This is essential for hash tables in shared memory.
+ * For hash tables in local memory, we used to use palloc() which will throw
+ * error on failure; but we no longer do, so it's untested whether this
+ * module will still cope with that behavior.)
  *
  * dynahash.c provides support for these types of lookup keys:
  *
@@ -98,6 +99,7 @@
 
 #include "access/xact.h"
 #include "common/hashfn.h"
+#include "lib/ilist.h"
 #include "port/pg_bitutils.h"
 #include "storage/shmem.h"
 #include "storage/spin.h"
@@ -236,6 +238,16 @@ struct HTAB
 	Size		keysize;		/* hash key length in bytes */
 	long		ssize;			/* segment size --- must be power of 2 */
 	int			sshift;			/* segment shift = log2(ssize) */
+
+	/*
+	 * In a USE_VALGRIND build, non-shared hashtables keep an slist chain of
+	 * all the element blocks they have allocated.  This pacifies Valgrind,
+	 * which would otherwise often claim that the element blocks are "possibly
+	 * lost" for lack of any non-interior pointers to their starts.
+	 */
+#ifdef USE_VALGRIND
+	slist_head	element_blocks;
+#endif
 };
 
 /*
@@ -1708,6 +1720,8 @@ element_alloc(HTAB *hashp, int nelem, int freelist_idx)
 {
 	HASHHDR    *hctl = hashp->hctl;
 	Size		elementSize;
+	Size		requestSize;
+	char	   *allocedBlock;
 	HASHELEMENT *firstElement;
 	HASHELEMENT *tmpElement;
 	HASHELEMENT *prevElement;
@@ -1719,12 +1733,38 @@ element_alloc(HTAB *hashp, int nelem, int freelist_idx)
 	/* Each element has a HASHELEMENT header plus user data. */
 	elementSize = MAXALIGN(sizeof(HASHELEMENT)) + MAXALIGN(hctl->entrysize);
 
+	requestSize = nelem * elementSize;
+
+	/* Add space for slist_node list link if we need one. */
+#ifdef USE_VALGRIND
+	if (!hashp->isshared)
+		requestSize += MAXALIGN(sizeof(slist_node));
+#endif
+
+	/* Allocate the memory. */
 	CurrentDynaHashCxt = hashp->hcxt;
-	firstElement = (HASHELEMENT *) hashp->alloc(nelem * elementSize);
+	allocedBlock = hashp->alloc(requestSize);
 
-	if (!firstElement)
+	if (!allocedBlock)
 		return false;
 
+	/*
+	 * If USE_VALGRIND, each allocated block of elements of a non-shared
+	 * hashtable is chained into a list, so that Valgrind won't think it's
+	 * been leaked.
+	 */
+#ifdef USE_VALGRIND
+	if (hashp->isshared)
+		firstElement = (HASHELEMENT *) allocedBlock;
+	else
+	{
+		slist_push_head(&hashp->element_blocks, (slist_node *) allocedBlock);
+		firstElement = (HASHELEMENT *) (allocedBlock + MAXALIGN(sizeof(slist_node)));
+	}
+#else
+	firstElement = (HASHELEMENT *) allocedBlock;
+#endif
+
 	/* prepare to link all the new entries into the freelist */
 	prevElement = NULL;
 	tmpElement = firstElement;
-- 
2.43.5

From 8f3915908e39ece6572198decb16d91a76de6947 Mon Sep 17 00:00:00 2001
From: Tom Lane <t...@sss.pgh.pa.us>
Date: Sun, 11 May 2025 13:09:53 -0400
Subject: [PATCH v1 04/11] Silence complaints about leaked catcache entries.

Put the dlist_node fields of catctup and catclist structs first.
This ensures that the dlist pointers point to the starts of these
palloc blocks, and thus that Valgrind won't consider them
"possibly lost".

This is a hack, since it implies a similar requirement for every other
struct that we chain into slists or dlists (unless there are other
pointers to them, as there are for the CatCache structs).  But there
is no other obvious alternative.  Fixing these two places seems to be
enough to silence all such Valgrind complaints in simple test cases.

Author: Tom Lane <t...@sss.pgh.pa.us>
Discussion: https://postgr.es/m/285483.1746756...@sss.pgh.pa.us
---
 src/include/utils/catcache.h | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/src/include/utils/catcache.h b/src/include/utils/catcache.h
index 277ec33c00b..00808e23f49 100644
--- a/src/include/utils/catcache.h
+++ b/src/include/utils/catcache.h
@@ -87,6 +87,14 @@ typedef struct catcache
 
 typedef struct catctup
 {
+	/*
+	 * Each tuple in a cache is a member of a dlist that stores the elements
+	 * of its hash bucket.  We keep each dlist in LRU order to speed repeated
+	 * lookups.  Keep the dlist_node field first so that Valgrind understands
+	 * the struct is reachable.
+	 */
+	dlist_node	cache_elem;		/* list member of per-bucket list */
+
 	int			ct_magic;		/* for identifying CatCTup entries */
 #define CT_MAGIC   0x57261502
 
@@ -98,13 +106,6 @@ typedef struct catctup
 	 */
 	Datum		keys[CATCACHE_MAXKEYS];
 
-	/*
-	 * Each tuple in a cache is a member of a dlist that stores the elements
-	 * of its hash bucket.  We keep each dlist in LRU order to speed repeated
-	 * lookups.
-	 */
-	dlist_node	cache_elem;		/* list member of per-bucket list */
-
 	/*
 	 * A tuple marked "dead" must not be returned by subsequent searches.
 	 * However, it won't be physically deleted from the cache until its
@@ -158,13 +159,17 @@ typedef struct catctup
  */
 typedef struct catclist
 {
+	/*
+	 * Keep the dlist_node field first so that Valgrind understands the struct
+	 * is reachable.
+	 */
+	dlist_node	cache_elem;		/* list member of per-catcache list */
+
 	int			cl_magic;		/* for identifying CatCList entries */
 #define CL_MAGIC   0x52765103
 
 	uint32		hash_value;		/* hash value for lookup keys */
 
-	dlist_node	cache_elem;		/* list member of per-catcache list */
-
 	/*
 	 * Lookup keys for the entry, with the first nkeys elements being valid.
 	 * All by-reference are separately allocated.
-- 
2.43.5

From 206cff7a169626dc32c411751f472c2147cd7a98 Mon Sep 17 00:00:00 2001
From: Tom Lane <t...@sss.pgh.pa.us>
Date: Sun, 11 May 2025 13:14:52 -0400
Subject: [PATCH v1 05/11] Silence complaints about leaked CatCache structs.

In USE_VALGRIND builds, don't attempt to cache-align CatCache
structs.  That prevents them from being reported as
"possibly lost" because all the pointers to them look like
interior pointers to Valgrind.

This is a hack, not meant for final commit.  The same hazard exists
for every other use of MemoryContextAllocAligned, so we really ought
to fix it there instead of touching this code.  But this seems to be
enough to silence all such Valgrind complaints in simple test cases,
so just do this much for now.

Author: Tom Lane <t...@sss.pgh.pa.us>
Discussion: https://postgr.es/m/285483.1746756...@sss.pgh.pa.us
---
 src/backend/utils/cache/catcache.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/cache/catcache.c b/src/backend/utils/cache/catcache.c
index 657648996c2..9e82d9fec68 100644
--- a/src/backend/utils/cache/catcache.c
+++ b/src/backend/utils/cache/catcache.c
@@ -923,12 +923,19 @@ InitCatCache(int id,
 	}
 
 	/*
-	 * Allocate a new cache structure, aligning to a cacheline boundary
+	 * Allocate a new cache structure.  Normally we align it to a cacheline
+	 * boundary, but skip that in USE_VALGRIND builds because it will confuse
+	 * Valgrind.  (XXX really we should fix that problem in
+	 * MemoryContextAllocAligned, not here.)
 	 *
 	 * Note: we rely on zeroing to initialize all the dlist headers correctly
 	 */
+#ifndef USE_VALGRIND
 	cp = (CatCache *) palloc_aligned(sizeof(CatCache), PG_CACHE_LINE_SIZE,
 									 MCXT_ALLOC_ZERO);
+#else
+	cp = (CatCache *) palloc0(sizeof(CatCache));
+#endif
 	cp->cc_bucket = palloc0(nbuckets * sizeof(dlist_head));
 
 	/*
-- 
2.43.5

From 6dbb1a4c32f394a7dc63073027bc942a5c1798a3 Mon Sep 17 00:00:00 2001
From: Tom Lane <t...@sss.pgh.pa.us>
Date: Sun, 11 May 2025 13:19:35 -0400
Subject: [PATCH v1 06/11] Silence complaints about save_ps_display_args.

Valgrind seems not to consider the global "environ" variable as a
valid root pointer; so when we allocate a new environment array,
it claims that data is leaked.  To fix that, keep our own
statically-allocated copy of the pointer.

Author: Tom Lane <t...@sss.pgh.pa.us>
Discussion: https://postgr.es/m/285483.1746756...@sss.pgh.pa.us
---
 src/backend/utils/misc/ps_status.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/src/backend/utils/misc/ps_status.c b/src/backend/utils/misc/ps_status.c
index e08b26e8c14..4df25944deb 100644
--- a/src/backend/utils/misc/ps_status.c
+++ b/src/backend/utils/misc/ps_status.c
@@ -100,6 +100,17 @@ static void flush_ps_display(void);
 static int	save_argc;
 static char **save_argv;
 
+/*
+ * Valgrind seems not to consider the global "environ" variable as a valid
+ * root pointer; so when we allocate a new environment array, it claims that
+ * data is leaked.  To fix that, keep our own statically-allocated copy of the
+ * pointer.  (Oddly, this doesn't seem to be a problem for "argv".)
+ */
+#if defined(PS_USE_CLOBBER_ARGV) && defined(USE_VALGRIND)
+extern char **ps_status_new_environ;
+char	  **ps_status_new_environ;
+#endif
+
 
 /*
  * Call this early in startup to save the original argc/argv values.
@@ -206,6 +217,11 @@ save_ps_display_args(int argc, char **argv)
 		}
 		new_environ[i] = NULL;
 		environ = new_environ;
+
+		/* See notes about Valgrind above. */
+#ifdef USE_VALGRIND
+		ps_status_new_environ = new_environ;
+#endif
 	}
 
 	/*
-- 
2.43.5

From baf462f1854ffc8a38b0c4bdabb4575e36cea0f6 Mon Sep 17 00:00:00 2001
From: Tom Lane <t...@sss.pgh.pa.us>
Date: Sun, 11 May 2025 13:20:59 -0400
Subject: [PATCH v1 07/11] Don't leak the startup-packet buffer in
 ProcessStartupPacket.

This is the first actual code bug fix in this patch series.

I only bothered to free the buffer in the successful-exit code
paths, not the error-exit ones, since we'd quit soon anyway in
the latter cases.

Author: Tom Lane <t...@sss.pgh.pa.us>
Discussion: https://postgr.es/m/285483.1746756...@sss.pgh.pa.us
---
 src/backend/tcop/backend_startup.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/backend/tcop/backend_startup.c b/src/backend/tcop/backend_startup.c
index a7d1fec981f..0b956627934 100644
--- a/src/backend/tcop/backend_startup.c
+++ b/src/backend/tcop/backend_startup.c
@@ -627,6 +627,8 @@ ProcessStartupPacket(Port *port, bool ssl_done, bool gss_done)
 					 errmsg("received unencrypted data after SSL request"),
 					 errdetail("This could be either a client-software bug or evidence of an attempted man-in-the-middle attack.")));
 
+		pfree(buf);
+
 		/*
 		 * regular startup packet, cancel, etc packet should follow, but not
 		 * another SSL negotiation request, and a GSS request should only
@@ -681,6 +683,8 @@ ProcessStartupPacket(Port *port, bool ssl_done, bool gss_done)
 					 errmsg("received unencrypted data after GSSAPI encryption request"),
 					 errdetail("This could be either a client-software bug or evidence of an attempted man-in-the-middle attack.")));
 
+		pfree(buf);
+
 		/*
 		 * regular startup packet, cancel, etc packet should follow, but not
 		 * another GSS negotiation request, and an SSL request should only
@@ -858,6 +862,8 @@ ProcessStartupPacket(Port *port, bool ssl_done, bool gss_done)
 	if (am_walsender && !am_db_walsender)
 		port->database_name[0] = '\0';
 
+	pfree(buf);
+
 	/*
 	 * Done filling the Port structure
 	 */
-- 
2.43.5

From c8e124f6c6d209e39072e6c2859c476bce8c082e Mon Sep 17 00:00:00 2001
From: Tom Lane <t...@sss.pgh.pa.us>
Date: Sun, 11 May 2025 13:32:17 -0400
Subject: [PATCH v1 08/11] Fix some extremely broken code from 525392d57.

This uselessly makes two copies of the plan tree during
BuildCachedPlan; there should be only one, and it should be
in the stmt_context, not plan_context.  Also, UpdateCachedPlan
has a fragile and undocumented assumption that the length of
the stmt_list can't change.  (The XXX comments in that function
have a very strong whiff of code that wasn't really ready to
commit, but this is enough fixing to silence Valgrind's complaints
about leaked data in the plan_context.)

Author: Tom Lane <t...@sss.pgh.pa.us>
Discussion: https://postgr.es/m/285483.1746756...@sss.pgh.pa.us
---
 src/backend/utils/cache/plancache.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 9bcbc4c3e97..53e3255ad7f 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -1114,17 +1114,18 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
 											 ALLOCSET_START_SMALL_SIZES);
 		MemoryContextCopyAndSetIdentifier(plan_context, plansource->query_string);
 
-		stmt_context = AllocSetContextCreate(CurrentMemoryContext,
+		stmt_context = AllocSetContextCreate(plan_context,
 											 "CachedPlan PlannedStmts",
 											 ALLOCSET_START_SMALL_SIZES);
 		MemoryContextCopyAndSetIdentifier(stmt_context, plansource->query_string);
-		MemoryContextSetParent(stmt_context, plan_context);
 
+		/*
+		 * Copy plans into the stmt_context.
+		 */
 		MemoryContextSwitchTo(stmt_context);
 		plist = copyObject(plist);
 
 		MemoryContextSwitchTo(plan_context);
-		plist = list_copy(plist);
 	}
 	else
 		plan_context = CurrentMemoryContext;
@@ -1207,8 +1208,6 @@ UpdateCachedPlan(CachedPlanSource *plansource, int query_index,
 {
 	List	   *query_list = plansource->query_list,
 			   *plan_list;
-	ListCell   *l1,
-			   *l2;
 	CachedPlan *plan = plansource->gplan;
 	MemoryContext oldcxt;
 
@@ -1260,12 +1259,7 @@ UpdateCachedPlan(CachedPlanSource *plansource, int query_index,
 
 	MemoryContextReset(plan->stmt_context);
 	oldcxt = MemoryContextSwitchTo(plan->stmt_context);
-	forboth(l1, plan_list, l2, plan->stmt_list)
-	{
-		PlannedStmt *plannedstmt = lfirst(l1);
-
-		lfirst(l2) = copyObject(plannedstmt);
-	}
+	plan->stmt_list = copyObject(plan_list);
 	MemoryContextSwitchTo(oldcxt);
 
 	/*
-- 
2.43.5

From f5b72b015066053ed3396f0ea010ae470911f3b6 Mon Sep 17 00:00:00 2001
From: Tom Lane <t...@sss.pgh.pa.us>
Date: Sun, 11 May 2025 13:38:31 -0400
Subject: [PATCH v1 09/11] Avoid per-relation leakage in autovacuum.

PgStat_StatTabEntry and AutoVacOpts structs were leaked until
the end of the autovacuum worker's run, which is bad news if
there are a lot of relations in the database.

Note: pfree'ing the PgStat_StatTabEntry structs here seems
a bit risky, because pgstat_fetch_stat_tabentry_ext does not
guarantee anything about whether its result is long-lived.
That API could use a re-think.

Also, free a couple of structures retail when USE_VALGRIND.
This doesn't have any effect on the process's total memory
consumption, but it does reduce noise in valgrind leakage
reports.

Author: Tom Lane <t...@sss.pgh.pa.us>
Discussion: https://postgr.es/m/285483.1746756...@sss.pgh.pa.us
---
 src/backend/postmaster/autovacuum.c | 47 ++++++++++++++++++++++++-----
 1 file changed, 40 insertions(+), 7 deletions(-)

diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 4d4a1a3197e..ebe8af1056d 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -2077,6 +2077,12 @@ do_autovacuum(void)
 				}
 			}
 		}
+
+		/* Release stuff to avoid per-relation leakage */
+		if (relopts)
+			pfree(relopts);
+		if (tabentry)
+			pfree(tabentry);
 	}
 
 	table_endscan(relScan);
@@ -2093,7 +2099,8 @@ do_autovacuum(void)
 		Form_pg_class classForm = (Form_pg_class) GETSTRUCT(tuple);
 		PgStat_StatTabEntry *tabentry;
 		Oid			relid;
-		AutoVacOpts *relopts = NULL;
+		AutoVacOpts *relopts;
+		bool		free_relopts = false;
 		bool		dovacuum;
 		bool		doanalyze;
 		bool		wraparound;
@@ -2111,7 +2118,9 @@ do_autovacuum(void)
 		 * main rel
 		 */
 		relopts = extract_autovac_opts(tuple, pg_class_desc);
-		if (relopts == NULL)
+		if (relopts)
+			free_relopts = true;
+		else
 		{
 			av_relation *hentry;
 			bool		found;
@@ -2132,6 +2141,12 @@ do_autovacuum(void)
 		/* ignore analyze for toast tables */
 		if (dovacuum)
 			table_oids = lappend_oid(table_oids, relid);
+
+		/* Release stuff to avoid leakage */
+		if (free_relopts)
+			pfree(relopts);
+		if (tabentry)
+			pfree(tabentry);
 	}
 
 	table_endscan(relScan);
@@ -2503,6 +2518,8 @@ deleted:
 		pg_atomic_test_set_flag(&MyWorkerInfo->wi_dobalance);
 	}
 
+	list_free(table_oids);
+
 	/*
 	 * Perform additional work items, as requested by backends.
 	 */
@@ -2545,8 +2562,15 @@ deleted:
 
 	/*
 	 * We leak table_toast_map here (among other things), but since we're
-	 * going away soon, it's not a problem.
+	 * going away soon, it's not a problem normally.  But when using Valgrind,
+	 * release some stuff to reduce complaints about leaked storage.
 	 */
+#ifdef USE_VALGRIND
+	hash_destroy(table_toast_map);
+	FreeTupleDesc(pg_class_desc);
+	if (bstrategy)
+		pfree(bstrategy);
+#endif
 
 	/*
 	 * Update pg_database.datfrozenxid, and truncate pg_xact if possible. We
@@ -2684,8 +2708,8 @@ deleted2:
 /*
  * extract_autovac_opts
  *
- * Given a relation's pg_class tuple, return the AutoVacOpts portion of
- * reloptions, if set; otherwise, return NULL.
+ * Given a relation's pg_class tuple, return a palloc'd copy of the
+ * AutoVacOpts portion of reloptions, if set; otherwise, return NULL.
  *
  * Note: callers do not have a relation lock on the table at this point,
  * so the table could have been dropped, and its catalog rows gone, after
@@ -2734,6 +2758,7 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 	autovac_table *tab = NULL;
 	bool		wraparound;
 	AutoVacOpts *avopts;
+	bool		free_avopts = false;
 
 	/* fetch the relation's relcache entry */
 	classTup = SearchSysCacheCopy1(RELOID, ObjectIdGetDatum(relid));
@@ -2746,8 +2771,10 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 	 * main table reloptions if the toast table itself doesn't have.
 	 */
 	avopts = extract_autovac_opts(classTup, pg_class_desc);
-	if (classForm->relkind == RELKIND_TOASTVALUE &&
-		avopts == NULL && table_toast_map != NULL)
+	if (avopts)
+		free_avopts = true;
+	else if (classForm->relkind == RELKIND_TOASTVALUE &&
+			 table_toast_map != NULL)
 	{
 		av_relation *hentry;
 		bool		found;
@@ -2856,6 +2883,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map,
 						 avopts->vacuum_cost_delay >= 0));
 	}
 
+	if (free_avopts)
+		pfree(avopts);
 	heap_freetuple(classTup);
 	return tab;
 }
@@ -2887,6 +2916,10 @@ recheck_relation_needs_vacanalyze(Oid relid,
 							  effective_multixact_freeze_max_age,
 							  dovacuum, doanalyze, wraparound);
 
+	/* Release tabentry to avoid leakage */
+	if (tabentry)
+		pfree(tabentry);
+
 	/* ignore ANALYZE for toast tables */
 	if (classForm->relkind == RELKIND_TOASTVALUE)
 		*doanalyze = false;
-- 
2.43.5
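(The recurring pattern in 0009 is "free only if we allocated it": extract_autovac_opts now returns a palloc'd copy that the caller owns, while the fallback taken from the toast map is shared storage that must not be freed. Boiled down to a self-contained sketch with invented types and names, nothing PostgreSQL-specific:)

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct Opts
{
	int			cost_delay;
} Opts;

/*
 * Stand-in for extract_autovac_opts(): returns a freshly malloc'd copy
 * of the stored options, or NULL when the relation has none.
 */
static Opts *
extract_opts(const Opts *stored)
{
	if (stored == NULL)
		return NULL;

	Opts	   *copy = malloc(sizeof(Opts));

	*copy = *stored;
	return copy;
}

/*
 * Track ownership with a flag: we free the pointer only when it came
 * from extract_opts(), never when it aliases the shared fallback.
 */
static int
relation_cost_delay(const Opts *stored, const Opts *shared_fallback)
{
	bool		free_opts = false;
	Opts	   *opts = extract_opts(stored);

	if (opts)
		free_opts = true;
	else
		opts = (Opts *) shared_fallback;	/* not ours to free */

	int			result = opts ? opts->cost_delay : -1;

	if (free_opts)
		free(opts);
	return result;
}
```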

From 9815596aaecaec254c638e1a150cac7860954f1c Mon Sep 17 00:00:00 2001
From: Tom Lane <t...@sss.pgh.pa.us>
Date: Sun, 11 May 2025 13:44:22 -0400
Subject: [PATCH v1 10/11] Silence complaints about leaks in
 load_domaintype_info().

This code intentionally leaks some intermediate cruft into the
long-lived DomainConstraintCache's memory context, reasoning that the
amount of leakage will typically not be much so it's not worth doing
a copyObject() of the final tree to avoid that.  But Valgrind knows
nothing of engineering tradeoffs and complains anyway.  So modify the
code to do it the slow clean way when USE_VALGRIND.

Author: Tom Lane <t...@sss.pgh.pa.us>
Discussion: https://postgr.es/m/285483.1746756...@sss.pgh.pa.us
---
 src/backend/utils/cache/typcache.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/cache/typcache.c b/src/backend/utils/cache/typcache.c
index f9aec38a11f..bf0e362eae6 100644
--- a/src/backend/utils/cache/typcache.c
+++ b/src/backend/utils/cache/typcache.c
@@ -1189,8 +1189,20 @@ load_domaintype_info(TypeCacheEntry *typentry)
 				dcc->dccRefCount = 0;
 			}
 
-			/* Create node trees in DomainConstraintCache's context */
+			/*
+			 * The clean way to do the next bit is to run stringToNode() and
+			 * expression_planner() in the short-lived caller's context, then
+			 * copy the result into the DomainConstraintCache's dccContext.
+			 * Under Valgrind, we do it that way to silence complaints about
+			 * leaking cruft into dccContext.  However, the amount of stuff
+			 * leaked is usually pretty minimal, and the cost of the extra
+			 * copyObject() call is not negligible.  So when not catering to
+			 * Valgrind, we do it all in dccContext and accept some leakage.
+			 */
+#ifndef USE_VALGRIND
+			/* Do all this work in DomainConstraintCache's context */
 			oldcxt = MemoryContextSwitchTo(dcc->dccContext);
+#endif
 
 			check_expr = (Expr *) stringToNode(constring);
 
@@ -1206,10 +1218,19 @@ load_domaintype_info(TypeCacheEntry *typentry)
 			 */
 			check_expr = expression_planner(check_expr);
 
+#ifdef USE_VALGRIND
+			/* Create only the minimally needed stuff in dccContext */
+			oldcxt = MemoryContextSwitchTo(dcc->dccContext);
+#endif
+
 			r = makeNode(DomainConstraintState);
 			r->constrainttype = DOM_CONSTRAINT_CHECK;
 			r->name = pstrdup(NameStr(c->conname));
+#ifdef USE_VALGRIND
+			r->check_expr = copyObject(check_expr);
+#else
 			r->check_expr = check_expr;
+#endif
 			r->check_exprstate = NULL;
 
 			MemoryContextSwitchTo(oldcxt);
-- 
2.43.5

From 7adfb9e9dee4725c8950a49dabd9cac5999bb691 Mon Sep 17 00:00:00 2001
From: Tom Lane <t...@sss.pgh.pa.us>
Date: Sun, 11 May 2025 14:50:10 -0400
Subject: [PATCH v1 11/11] WIP: reduce leakages during PL/pgSQL function
 compilation.

format_procedure leaks memory, so run it in a short-lived context
not the session-lifespan cache context for the PL/pgSQL function.

parse_datatype called the core parser in the function's long-lived
cache context, thus leaking potentially a lot of storage into that
context.  We were also being a bit careless with the TypeName
structures made in that code path and others.  Most of the time we
don't need to retain the TypeName, so make sure it is made in the
short-lived temp context, and copy it only if we do need to retain it.

These are far from the only leaks in PL/pgSQL compilation, but
they're the biggest as far as I've seen.  However, this is just
WIP and may not get committed in this form at all: it's not clear
that this line of attack is enough to remove all leaks in this
area.  It might be better to try to run exclusively within
the short-term context and explicitly allocate what needs to
go into the function parsetree.

Author: Tom Lane <t...@sss.pgh.pa.us>
Discussion: https://postgr.es/m/285483.1746756...@sss.pgh.pa.us
---
 src/pl/plpgsql/src/pl_comp.c | 28 ++++++++++++++++++++++------
 src/pl/plpgsql/src/pl_gram.y |  8 +++++++-
 src/tools/valgrind.supp      | 11 -----------
 3 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 519f7695d7c..5f3e8646fbb 100644
--- a/src/pl/plpgsql/src/pl_comp.c
+++ b/src/pl/plpgsql/src/pl_comp.c
@@ -177,6 +177,7 @@ plpgsql_compile_callback(FunctionCallInfo fcinfo,
 	yyscan_t	scanner;
 	Datum		prosrcdatum;
 	char	   *proc_source;
+	char	   *proc_signature;
 	HeapTuple	typeTup;
 	Form_pg_type typeStruct;
 	PLpgSQL_variable *var;
@@ -223,6 +224,9 @@ plpgsql_compile_callback(FunctionCallInfo fcinfo,
 	plpgsql_check_syntax = forValidator;
 	plpgsql_curr_compile = function;
 
+	/* format_procedure leaks memory, so run it in temp context */
+	proc_signature = format_procedure(fcinfo->flinfo->fn_oid);
+
 	/*
 	 * All the permanent output of compilation (e.g. parse tree) is kept in a
 	 * per-function memory context, so it can be reclaimed easily.
@@ -232,7 +236,7 @@ plpgsql_compile_callback(FunctionCallInfo fcinfo,
 									 ALLOCSET_DEFAULT_SIZES);
 	plpgsql_compile_tmp_cxt = MemoryContextSwitchTo(func_cxt);
 
-	function->fn_signature = format_procedure(fcinfo->flinfo->fn_oid);
+	function->fn_signature = pstrdup(proc_signature);
 	MemoryContextSetIdentifier(func_cxt, function->fn_signature);
 	function->fn_oid = fcinfo->flinfo->fn_oid;
 	function->fn_input_collation = fcinfo->fncollation;
@@ -1658,6 +1662,11 @@ plpgsql_parse_wordrowtype(char *ident)
 {
 	Oid			classOid;
 	Oid			typOid;
+	TypeName   *typName;
+	MemoryContext oldCxt;
+
+	/* Avoid memory leaks in long-term function context */
+	oldCxt = MemoryContextSwitchTo(plpgsql_compile_tmp_cxt);
 
 	/*
 	 * Look up the relation.  Note that because relation rowtypes have the
@@ -1680,9 +1689,12 @@ plpgsql_parse_wordrowtype(char *ident)
 				 errmsg("relation \"%s\" does not have a composite type",
 						ident)));
 
+	typName = makeTypeName(ident);
+
+	MemoryContextSwitchTo(oldCxt);
+
 	/* Build and return the row type struct */
-	return plpgsql_build_datatype(typOid, -1, InvalidOid,
-								  makeTypeName(ident));
+	return plpgsql_build_datatype(typOid, -1, InvalidOid, typName);
 }
 
 /* ----------
@@ -1696,6 +1708,7 @@ plpgsql_parse_cwordrowtype(List *idents)
 	Oid			classOid;
 	Oid			typOid;
 	RangeVar   *relvar;
+	TypeName   *typName;
 	MemoryContext oldCxt;
 
 	/*
@@ -1718,11 +1731,12 @@ plpgsql_parse_cwordrowtype(List *idents)
 				 errmsg("relation \"%s\" does not have a composite type",
 						relvar->relname)));
 
+	typName = makeTypeNameFromNameList(idents);
+
 	MemoryContextSwitchTo(oldCxt);
 
 	/* Build and return the row type struct */
-	return plpgsql_build_datatype(typOid, -1, InvalidOid,
-								  makeTypeNameFromNameList(idents));
+	return plpgsql_build_datatype(typOid, -1, InvalidOid, typName);
 }
 
 /*
@@ -1937,6 +1951,8 @@ plpgsql_build_recfield(PLpgSQL_rec *rec, const char *fldname)
  * origtypname is the parsed form of what the user wrote as the type name.
  * It can be NULL if the type could not be a composite type, or if it was
  * identified by OID to begin with (e.g., it's a function argument type).
+ * origtypname is in short-lived storage and must be copied if we choose
+ * to incorporate it into the function's parse tree.
  */
 PLpgSQL_type *
 plpgsql_build_datatype(Oid typeOid, int32 typmod,
@@ -2055,7 +2071,7 @@ build_datatype(HeapTuple typeTup, int32 typmod,
 					 errmsg("type %s is not composite",
 							format_type_be(typ->typoid))));
 
-		typ->origtypname = origtypname;
+		typ->origtypname = copyObject(origtypname);
 		typ->tcache = typentry;
 		typ->tupdesc_id = typentry->tupDesc_identifier;
 	}
diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 5612e66d023..6d53da4e79a 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -3848,6 +3848,7 @@ parse_datatype(const char *string, int location, yyscan_t yyscanner)
 	int32		typmod;
 	sql_error_callback_arg cbarg;
 	ErrorContextCallback syntax_errcontext;
+	MemoryContext oldCxt;
 
 	cbarg.location = location;
 	cbarg.yyscanner = yyscanner;
@@ -3857,9 +3858,14 @@ parse_datatype(const char *string, int location, yyscan_t yyscanner)
 	syntax_errcontext.previous = error_context_stack;
 	error_context_stack = &syntax_errcontext;
 
-	/* Let the main parser try to parse it under standard SQL rules */
+	/*
+	 * Let the main parser try to parse it under standard SQL rules.  The
+	 * parser leaks memory, so run it in temp context.
+	 */
+	oldCxt = MemoryContextSwitchTo(plpgsql_compile_tmp_cxt);
 	typeName = typeStringToTypeName(string, NULL);
 	typenameTypeIdAndMod(NULL, typeName, &type_id, &typmod);
+	MemoryContextSwitchTo(oldCxt);
 
 	/* Restore former ereport callback */
 	error_context_stack = syntax_errcontext.previous;
diff --git a/src/tools/valgrind.supp b/src/tools/valgrind.supp
index 7d4fb0b2c1d..80268d6eed8 100644
--- a/src/tools/valgrind.supp
+++ b/src/tools/valgrind.supp
@@ -201,17 +201,6 @@
    fun:MemoryContextAllocAligned
 }
 
-# Suppress complaints about stuff leaked by PL/pgSQL parsing.
-# This ought to be fixed properly, instead.
-{
-   hide_plpgsql_parsing_leaks
-   Memcheck:Leak
-   match-leak-kinds: definite,possible,indirect
-
-   ...
-   fun:plpgsql_compile_callback
-}
-
 # And likewise for stuff leaked by SQL-function parsing.
 # This ought to be fixed properly, instead.
 {
-- 
2.43.5
