Re: Clean up NamedLWLockTranche stuff
On 28/03/2026 19:20, Sami Imseih wrote:
> Hi Heikki,
>
> Just raising this again to make sure it doesn’t get overlooked [1].

Fixed, thanks!

- Heikki
Re: Clean up NamedLWLockTranche stuff
Hi Heikki,

Just raising this again to make sure it doesn’t get overlooked [1].

Thanks!

[1] https://www.postgresql.org/message-id/CAA5RZ0vPWNMvTBqyH7nqDRrHd6Y4Et5iNqXFuwpbsPOk3cL4rQ%40mail.gmail.com

--
Sami
Re: Clean up NamedLWLockTranche stuff
On 28/03/2026 00:10, Nathan Bossart wrote:
> On Sat, Mar 28, 2026 at 12:07:26AM +0200, Heikki Linnakangas wrote:
>> LGTM, thanks! Will you commit or want me to pick it up?
>
> I'm not able to commit it right this second, so feel free to take it. Else
> it'll probably be a day or two before I can get to it.

Ok, committed, thanks!

- Heikki
Re: Clean up NamedLWLockTranche stuff
On Sat, Mar 28, 2026 at 12:07:26AM +0200, Heikki Linnakangas wrote:
> LGTM, thanks! Will you commit or want me to pick it up?

I'm not able to commit it right this second, so feel free to take it. Else
it'll probably be a day or two before I can get to it.

--
nathan
Re: Clean up NamedLWLockTranche stuff
On 28/03/2026 00:05, Nathan Bossart wrote:
> On Fri, Mar 27, 2026 at 04:50:12PM -0500, Nathan Bossart wrote:
>> On Fri, Mar 27, 2026 at 05:22:33PM -0400, Andres Freund wrote:
>>> TRAP: failed Assert("MemoryContextIsValid(context)"), File: "mcxt.c", Line:
>>> 1270, PID: 230491
>>> [...](ExceptionalCondition+0x54)[0xe186c204]
>>> [...](MemoryContextAllocExtended+0x0)[0xe18a2a24]
>>> [...](RequestNamedLWLockTranche+0x6c)[0xe16e7310]
>>> [...](process_shmem_requests+0x28)[0xe1881628]
>>> [...](PostgresSingleUserMain+0xc4)[0xe1701a34]
>>> [...](main+0x6ac)[0xe12a2adc]
>>> /lib/aarch64-linux-gnu/libc.so.6(__libc_start_main+0xe8)[0x99713dd8]
>>> [...](+0xf2b98)[0xe12a2b98]
>>> Aborted
>>> pg_rewind: error: postgres single-user mode in target cluster failed
>>
>> Hm. AFAICT PostmasterContext isn't created in single-user mode, and the
>> commit in question has RequestNamedLWLockTranche() allocate requests there.
>> I guess the idea is to allow backends to free that memory after forking
>> from postmaster, but we don't do that for the NamedLWLockTrancheRequests
>> list. Maybe we should surround the last part of that function with
>> MemoryContextSwitchTo(...) to either TopMemoryContext or PostmasterContext
>> depending on whether we're in single-user mode.
>
> Concretely, like the attached.

LGTM, thanks! Will you commit or want me to pick it up?

- Heikki
Re: Clean up NamedLWLockTranche stuff
On Fri, Mar 27, 2026 at 04:50:12PM -0500, Nathan Bossart wrote:
> On Fri, Mar 27, 2026 at 05:22:33PM -0400, Andres Freund wrote:
>> TRAP: failed Assert("MemoryContextIsValid(context)"), File: "mcxt.c", Line:
>> 1270, PID: 230491
>> [...](ExceptionalCondition+0x54)[0xe186c204]
>> [...](MemoryContextAllocExtended+0x0)[0xe18a2a24]
>> [...](RequestNamedLWLockTranche+0x6c)[0xe16e7310]
>> [...](process_shmem_requests+0x28)[0xe1881628]
>> [...](PostgresSingleUserMain+0xc4)[0xe1701a34]
>> [...](main+0x6ac)[0xe12a2adc]
>> /lib/aarch64-linux-gnu/libc.so.6(__libc_start_main+0xe8)[0x99713dd8]
>> [...](+0xf2b98)[0xe12a2b98]
>> Aborted
>> pg_rewind: error: postgres single-user mode in target cluster failed
>
> Hm. AFAICT PostmasterContext isn't created in single-user mode, and the
> commit in question has RequestNamedLWLockTranche() allocate requests there.
> I guess the idea is to allow backends to free that memory after forking
> from postmaster, but we don't do that for the NamedLWLockTrancheRequests
> list. Maybe we should surround the last part of that function with
> MemoryContextSwitchTo(...) to either TopMemoryContext or PostmasterContext
> depending on whether we're in single-user mode.
Concretely, like the attached.
--
nathan
From 714c765399656c9743e3ad76fe52d1cfc8994952 Mon Sep 17 00:00:00 2001
From: Nathan Bossart
Date: Fri, 27 Mar 2026 17:00:49 -0500
Subject: [PATCH 1/1] fix RequestNamedLWLockTranche
---
 src/backend/storage/lmgr/lwlock.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 7a68071302a..f5b2a6d479d 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -630,6 +630,7 @@ void
 RequestNamedLWLockTranche(const char *tranche_name, int num_lwlocks)
 {
 	NamedLWLockTrancheRequest *request;
+	MemoryContext oldcontext;
 
 	if (!process_shmem_requests_in_progress)
 		elog(FATAL, "cannot request additional LWLocks outside shmem_request_hook");
@@ -652,10 +653,17 @@ RequestNamedLWLockTranche(const char *tranche_name, int num_lwlocks)
 				 errdetail("No more than %d tranches may be registered.",
 						   MAX_USER_DEFINED_TRANCHES)));
 
-	request = MemoryContextAllocZero(PostmasterContext, sizeof(NamedLWLockTrancheRequest));
+	if (IsPostmasterEnvironment)
+		oldcontext = MemoryContextSwitchTo(PostmasterContext);
+	else
+		oldcontext = MemoryContextSwitchTo(TopMemoryContext);
+
+	request = palloc0(sizeof(NamedLWLockTrancheRequest));
 	strlcpy(request->tranche_name, tranche_name, NAMEDATALEN);
 	request->num_lwlocks = num_lwlocks;
 	NamedLWLockTrancheRequests = lappend(NamedLWLockTrancheRequests, request);
+
+	MemoryContextSwitchTo(oldcontext);
 }
 
 /*
--
2.50.1 (Apple Git-155)
Re: Clean up NamedLWLockTranche stuff
On Fri, Mar 27, 2026 at 05:22:33PM -0400, Andres Freund wrote:
> TRAP: failed Assert("MemoryContextIsValid(context)"), File: "mcxt.c", Line:
> 1270, PID: 230491
> [...](ExceptionalCondition+0x54)[0xe186c204]
> [...](MemoryContextAllocExtended+0x0)[0xe18a2a24]
> [...](RequestNamedLWLockTranche+0x6c)[0xe16e7310]
> [...](process_shmem_requests+0x28)[0xe1881628]
> [...](PostgresSingleUserMain+0xc4)[0xe1701a34]
> [...](main+0x6ac)[0xe12a2adc]
> /lib/aarch64-linux-gnu/libc.so.6(__libc_start_main+0xe8)[0x99713dd8]
> [...](+0xf2b98)[0xe12a2b98]
> Aborted
> pg_rewind: error: postgres single-user mode in target cluster failed
Hm. AFAICT PostmasterContext isn't created in single-user mode, and the
commit in question has RequestNamedLWLockTranche() allocate requests there.
I guess the idea is to allow backends to free that memory after forking
from postmaster, but we don't do that for the NamedLWLockTrancheRequests
list. Maybe we should surround the last part of that function with
MemoryContextSwitchTo(...) to either TopMemoryContext or PostmasterContext
depending on whether we're in single-user mode.
--
nathan
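
[Editorial sketch, not part of the thread.] The failure mode described above can be modelled in a few lines of standalone C. This is only a toy: ToyContext, top_context, postmaster_context and choose_request_context() are hypothetical names invented for illustration, and a context that was never created is represented as NULL. It assumes nothing about PostgreSQL's real memory-context internals beyond what the message says, namely that PostmasterContext is not set up in single-user mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a memory context; not PostgreSQL's real struct. */
typedef struct ToyContext
{
    const char *name;
} ToyContext;

/* Hypothetical contexts for the sketch. */
static ToyContext top_context = {"TopMemoryContext"};
static ToyContext postmaster_context = {"PostmasterContext"};

/*
 * Hypothetical helper: choose where tranche requests are allocated.
 * Under the postmaster, use the postmaster's context; in single-user
 * mode that context does not exist, so fall back to the top-level one.
 */
static ToyContext *
choose_request_context(bool is_postmaster_environment,
                       ToyContext *postmaster_ctx,
                       ToyContext *top_ctx)
{
    ToyContext *ctx = is_postmaster_environment ? postmaster_ctx : top_ctx;

    /* Same spirit as Assert(MemoryContextIsValid(context)) in the trace. */
    assert(ctx != NULL);
    return ctx;
}
```

In this model, single-user mode corresponds to calling choose_request_context(false, NULL, &top_context), which succeeds because the NULL postmaster context is never selected.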
Re: Clean up NamedLWLockTranche stuff
Hi,
On 2026-03-27 11:45:56 +0200, Heikki Linnakangas wrote:
> Committed with that little change, thanks!
This seems to have broken buildfarm animal batta:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=batta&dt=2026-03-27%2002%3A05%3A01
# Running: pg_rewind --debug --source-pgdata
/home/admin/batta/buildroot/HEAD/pgsql.build/src/bin/pg_rewind/tmp_check/t_001_basic_standby_local_data/pgdata
--target-pgdata
/home/admin/batta/buildroot/HEAD/pgsql.build/src/bin/pg_rewind/tmp_check/t_001_basic_primary_local_data/pgdata
--no-sync --config-file
/home/admin/batta/buildroot/HEAD/pgsql.build/src/bin/pg_rewind/tmp_check/tmp_test_QbsG/primary-postgresql.conf.tmp
pg_rewind: executing
"/home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres"
for target server to complete crash recovery
TRAP: failed Assert("MemoryContextIsValid(context)"), File: "mcxt.c", Line:
1270, PID: 230491
/home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres(ExceptionalCondition+0x54)[0xe186c204]
/home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres(MemoryContextAllocExtended+0x0)[0xe18a2a24]
/home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres(RequestNamedLWLockTranche+0x6c)[0xe16e7310]
/home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres(process_shmem_requests+0x28)[0xe1881628]
/home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres(PostgresSingleUserMain+0xc4)[0xe1701a34]
/home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres(main+0x6ac)[0xe12a2adc]
/lib/aarch64-linux-gnu/libc.so.6(__libc_start_main+0xe8)[0x99713dd8]
/home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres(+0xf2b98)[0xe12a2b98]
Aborted
pg_rewind: error: postgres single-user mode in target cluster failed
pg_rewind: detail: Command was:
/home/admin/batta/buildroot/HEAD/pgsql.build/tmp_install/home/admin/batta/buildroot/HEAD/inst/bin/postgres
--single -F -D
/home/admin/batta/buildroot/HEAD/pgsql.build/src/bin/pg_rewind/tmp_check/t_001_basic_primary_local_data/pgdata
-c
config_file=/home/admin/batta/buildroot/HEAD/pgsql.build/src/bin/pg_rewind/tmp_check/tmp_test_QbsG/primary-postgresql.conf.tmp
template1 < /dev/null
Presumably the reason that batta failed is its special configuration:
shared_preload_libraries = 'pg_stat_statements'; regress_dump_restore;
wal_consistency_checking; compute_query_id = regress; --enable-injection-points
Greetings,
Andres Freund
Re: Clean up NamedLWLockTranche stuff
> Committed with that little change, thanks!

Thanks!

I think there is one more comment cleanup in lwlock.c

 /*
- * This points to the main array of LWLocks in shared memory. Backends inherit
- * the pointer by fork from the postmaster (except in the EXEC_BACKEND case,
- * where we have special measures to pass it down).
+ * This points to the main array of LWLocks in shared memory.
  */

we no longer need to take special measures to pass down MainLWLockArray
through the BackendParameters.

--
Sami

v1-0001-Remove-another-outdated-comment-regading-MainLWLo.patch
Re: Clean up NamedLWLockTranche stuff
On 27/03/2026 06:49, Sami Imseih wrote:
>> +/* backend-local copy of NamedLWLockTranches->num_user_defined */
>> +static int LocalNumUserDefinedTranches;
>>
>> The comment here should reference "LWLockTranches->num_user_defined"
>> instead.
>>
>> Also, there are a few places in lwlock.c where "named tranches" is mentioned.
>> Maybe we should just say "user-defined tranches" instead?
>
> Like the attached.
>
> @@ -460,7 +460,7 @@ LWLockShmemInit(void)
>  }
>
>  /*
> - * Initialize LWLocks that are fixed and those belonging to named tranches.
> + * Initialize LWLocks that are fixed and those belonging to user-defined tranches.
>   */
>  static void
>  InitializeLWLocks(int numLocks)

Only tranches requested with RequestNamedLWLockTranche() have locks in
the main array, so I reworded this some more to:

/*
 * Initialize LWLocks for built-in tranches and those requested with
 * RequestNamedLWLockTranche().
 */

Committed with that little change, thanks!

- Heikki
Re: Clean up NamedLWLockTranche stuff
> +/* backend-local copy of NamedLWLockTranches->num_user_defined */
> +static int LocalNumUserDefinedTranches;

> The comment here should reference "LWLockTranches->num_user_defined"
> instead.

> Also, there are a few places in lwlock.c where "named tranches" is mentioned.
> Maybe we should just say "user-defined tranches" instead?

Like the attached.

--
Sami

v1-0001-fix-some-comments-for-lwlock-tranches.patch
Re: Clean up NamedLWLockTranche stuff
Hi,

> > Thanks!
> >
> > On 26/03/2026 18:34, Sami Imseih wrote:
> >>> I propose the attached refactorings to make this less confusing. See
> >>> commit messages for details.
> >>
> >> I only took a look at 0001 so far, and I do agree with this statement
> >> in the commit message:
>
> I committed these now, but I'm all ears if you still have comments on
> the rest of the patches.

Sorry for the delay. I see you committed the rest. The only issue I found is
with d6eba30

+/* backend-local copy of NamedLWLockTranches->num_user_defined */
+static int LocalNumUserDefinedTranches;

The comment here should reference "LWLockTranches->num_user_defined"
instead.

>> rename RequestNamedLWLockTranche() to RequestUserDefinedLWLockTranche()
>> and LWLockNewTrancheId() to RegisterUserDefinedLWLockTranche()

> I'd rather not change RequestNamedLWLockTranche(), because I think
> LWLockNewTrancheId() is better and should be used in new code.

That's fair.

>> v19 is already changing the signature of LWLockNewTrancheId(), so maybe
>> improving the names of these APIs makes sense to do.

> Oh, I didn't realize we changed the LWLockNewTrancheId() signature!
> Yeah, if we're changing it anyway, we might as well rename it. I'm not
> sure if I like RegisterUserDefinedLWLockTranche() better, but let's
> think it through.

Maybe, RegisterNewLWLockTrancheId() could be more meaningful?

Also, there are a few places in lwlock.c where "named tranches" is mentioned.
Maybe we should just say "user-defined tranches" instead?

--
Sami Imseih
Amazon Web Services (AWS)
Re: Clean up NamedLWLockTranche stuff
On 26/03/2026 18:57, Heikki Linnakangas wrote:
> Thanks!
>
> On 26/03/2026 18:34, Sami Imseih wrote:
>>> I propose the attached refactorings to make this less confusing. See
>>> commit messages for details.
>>
>> I only took a look at 0001 so far, and I do agree with this statement
>> in the commit message:

I committed these now, but I'm all ears if you still have comments on
the rest of the patches.

- Heikki
Re: Clean up NamedLWLockTranche stuff
On 26/03/2026 16:37, Nathan Bossart wrote:
> On Thu, Mar 26, 2026 at 02:16:52PM +0200, Heikki Linnakangas wrote:
> 0002:
>> +	foreach(lc, NamedLWLockTrancheRequests)
>
> nitpick: These foreach loops seem like good opportunities to use
> foreach_ptr.
>
> The comment atop NumLWLocksForNamedTranches might benefit from mentioning
> RequestNamedLWLockTranche() and the fact that it only works in the
> postmaster. Perhaps an assertion is warranted, too.

There's already this check in RequestNamedLWLockTranche():

	if (!process_shmem_requests_in_progress)
		elog(FATAL, "cannot request additional LWLocks outside shmem_request_hook");

shmem_request_hooks are only called early at postmaster startup.

>> +	SpinLockAcquire(ShmemLock);
>> +	LocalNumUserDefinedTranches = LWLockTranches->num_user_defined;
>> +	SpinLockRelease(ShmemLock);
>
> Not critical, but it might be worth making num_user_defined an atomic.

Yeah I considered that. The lock is still needed in LWLockNewTrancheId(),
though, to prevent two LWLockNewTrancheId() calls from running
concurrently. Using an atomic would allow the extra optimization of
reading the value without acquiring the spinlock, but it seems clearer to
have a clear-cut rule that you must always hold the spinlock whenever
accessing the field.

> 0004:
>> +++ b/src/backend/storage/ipc/shmem.c
>> @@ -379,7 +379,8 @@ ShmemInitStruct(const char *name, Size size, bool *foundPtr)
>>
>>  	Assert(ShmemIndex != NULL);
>>
>> -	LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
>> +	if (IsUnderPostmaster)
>> +		LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
>
> Am I understanding that we assume ShmemInitStruct() is only called by the
> postmaster when there are no other backends yet?

Yeah. LWLockAcquire has this:

	/*
	 * We can't wait if we haven't got a PGPROC. This should only occur
	 * during bootstrap or shared memory initialization. Put an Assert here
	 * to catch unsafe coding practices.
	 */
	Assert(!(proc == NULL && IsUnderPostmaster));

To be honest I didn't realize we tolerate that, calling LWLockAcquire in
postmaster, until I started to work on this. It might be worth having
some extra sanity checks here, e.g. to throw an error if LWLockAcquire is
called from postmaster after startup. But this isn't new.

> 0005:
>> -	if (IsUnderPostmaster)
>> -		LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
>> +	LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
>
> Oh, this reverts many of these changes from 0004. Maybe the patches could
> be reordered to avoid this?

Makes sense.

Thanks for the review!

- Heikki
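
[Editorial sketch, not part of the thread.] The "always hold the spinlock when touching the field" rule discussed above can be illustrated with a standalone toy. Everything here is hypothetical: ToySpinLock is a minimal spinlock built on a C11 atomic_flag as a stand-in for PostgreSQL's spinlocks, and ToyTranches, toy_new_tranche_id() and toy_read_num_user_defined() are invented names. The cap TOY_MAX_USER_DEFINED is arbitrary.

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_MAX_USER_DEFINED 4      /* arbitrary cap for the sketch */

/* Minimal spinlock on a C11 atomic_flag; stand-in only. */
typedef struct ToySpinLock
{
    atomic_flag held;
} ToySpinLock;

typedef struct ToyTranches
{
    ToySpinLock lock;
    int         num_user_defined;   /* only accessed with lock held */
} ToyTranches;

static void
toy_spin_acquire(ToySpinLock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;                           /* busy-wait until the flag is clear */
}

static void
toy_spin_release(ToySpinLock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static void
toy_tranches_init(ToyTranches *t)
{
    atomic_flag_clear(&t->lock.held);
    t->num_user_defined = 0;
}

/* Writer: the lock serializes the read-modify-write of the counter. */
static int
toy_new_tranche_id(ToyTranches *t)
{
    int id = -1;

    toy_spin_acquire(&t->lock);
    if (t->num_user_defined < TOY_MAX_USER_DEFINED)
        id = t->num_user_defined++;
    toy_spin_release(&t->lock);
    return id;
}

/* Reader: takes the same lock, following the one clear-cut rule. */
static int
toy_read_num_user_defined(ToyTranches *t)
{
    int n;

    toy_spin_acquire(&t->lock);
    n = t->num_user_defined;
    toy_spin_release(&t->lock);
    return n;
}
```

Making the counter atomic would let the reader skip the lock, but the writer would still need it for the check-and-increment, so the toy keeps a single rule for both paths, as the reply above prefers.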
Re: Clean up NamedLWLockTranche stuff
Thanks!
On 26/03/2026 18:34, Sami Imseih wrote:
>> I propose the attached refactorings to make this less confusing. See
>> commit messages for details.
>
> I only took a look at 0001 so far, and I do agree with this statement
> in the commit message:
>
> "The "user defined" term was already used in LWTRANCHE_FIRST_USER_DEFINED,
> so let's standardize on that to mean tranches allocated with either
> RequestNamedLWLockTranche() or LWLockNewTrancheId()."
>
> I do wonder if 0001 is going far enough though.
>
> Instead of just standardizing that "user defined" could mean tranches allocated
> with RequestNamedLWLockTranche() or LWLockNewTrancheId(), how about we also
> rename these APIs to reflect that as well? This way we remove all concept of
> "named tranche" which is what it sounds like to me you are proposing.
>
> rename RequestNamedLWLockTranche() to RequestUserDefinedLWLockTranche()
> and LWLockNewTrancheId() to RegisterUserDefinedLWLockTranche()

I'd rather not change RequestNamedLWLockTranche(), because I think
LWLockNewTrancheId() is better and should be used in new code. I
consider RequestNamedLWLockTranche() to be a legacy function, for
backwards compatibility.

> RequestNamedLWLockTranche() requests the lwlock at shmem_request time,
> which is later registered via LWLockNewTrancheId() when lwlocks are
> initialized by the postmaster.
>
> Also, the name LWLockNewTrancheId() is selling what this function does
> too short.
> It does return a new tranche ID, but it also takes in a user-defined tranche
> name and copies ("registers") that name into LWLockTrancheNames.
>
> v19 is already changing the signature of LWLockNewTrancheId(), so maybe
> improving the names of these APIs makes sense to do.

Oh, I didn't realize we changed the LWLockNewTrancheId() signature!
Yeah, if we're changing it anyway, we might as well rename it. I'm not
sure if I like RegisterUserDefinedLWLockTranche() better, but let's
think it through.
- Heikki
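
[Editorial sketch, not part of the thread.] The naming argument above rests on the observation that the function both hands out a new ID and copies ("registers") the caller's name. A standalone toy registry shows that dual behaviour. All names here (ToyRegistry, toy_register_tranche(), toy_tranche_name(), the TOY_* constants) are hypothetical, the first-ID value is arbitrary, and rejecting over-long names is an assumption of the sketch rather than a statement about what PostgreSQL does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_FIRST_USER_DEFINED 100  /* arbitrary starting ID for the sketch */
#define TOY_MAX_TRANCHES       8
#define TOY_NAMELEN            64

/* Toy registry: a counter plus a table of copied names. */
typedef struct ToyRegistry
{
    int  count;
    char names[TOY_MAX_TRANCHES][TOY_NAMELEN];
} ToyRegistry;

/*
 * Allocate the next ID and copy the name into the registry.
 * Returns -1 if the registry is full or the name does not fit.
 */
static int
toy_register_tranche(ToyRegistry *reg, const char *name)
{
    size_t len = strlen(name);

    if (reg->count >= TOY_MAX_TRANCHES || len >= TOY_NAMELEN)
        return -1;

    /* Copy, so the caller's buffer need not outlive the call. */
    memcpy(reg->names[reg->count], name, len + 1);
    return TOY_FIRST_USER_DEFINED + reg->count++;
}

/* Look up the registered name for an ID, or NULL if unknown. */
static const char *
toy_tranche_name(const ToyRegistry *reg, int id)
{
    int idx = id - TOY_FIRST_USER_DEFINED;

    if (idx < 0 || idx >= reg->count)
        return NULL;
    return reg->names[idx];
}
```

Seen this way, a name containing "Register" describes the side effect on the name table, while "NewTrancheId" describes only the return value.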
Re: Clean up NamedLWLockTranche stuff
Hi,
Thanks for the patches!
> I propose the attached refactorings to make this less confusing. See
> commit messages for details.
I only took a look at 0001 so far, and I do agree with this statement
in the commit message:
"The "user defined" term was already used in LWTRANCHE_FIRST_USER_DEFINED,
so let's standardize on that to mean tranches allocated with either
RequestNamedLWLockTranche() or LWLockNewTrancheId()."
I do wonder if 0001 is going far enough though.
Instead of just standardizing that "user defined" could mean tranches allocated
with RequestNamedLWLockTranche() or LWLockNewTrancheId(), how about we also
rename these APIs to reflect that as well? This way we remove all concept of
"named tranche" which is what it sounds like to me you are proposing.
rename RequestNamedLWLockTranche() to RequestUserDefinedLWLockTranche()
and LWLockNewTrancheId() to RegisterUserDefinedLWLockTranche()
RequestNamedLWLockTranche() requests the lwlock at shmem_request time,
which is later registered via LWLockNewTrancheId() when lwlocks are
initialized by the postmaster.
Also, the name LWLockNewTrancheId() is selling what this function does
too short.
It does return a new tranche ID, but it also takes in a user-defined tranche
name and copies ("registers") that name into LWLockTrancheNames.
v19 is already changing the signature of LWLockNewTrancheId(), so maybe
improving the names of these APIs makes sense to do.
--
Sami Imseih
Amazon Web Services (AWS)
Re: Clean up NamedLWLockTranche stuff
On Thu, Mar 26, 2026 at 02:16:52PM +0200, Heikki Linnakangas wrote:
> At postmaster startup, NamedLWLockTrancheRequests points to a
> backend-private array. But after startup, and always in backends, it points
> to a copy in shared memory and LocalNamedLWLockTrancheRequestArray is used
> to hold the original. It took me a while to realize that
> NamedLWLockTrancheRequests in shared memory is *not* updated when you call
> LWLockNewTrancheId(), it only holds the requests made with
> RequestNamedLWLockTranche() before startup.

Right. LocalNamedLWLockTrancheRequestArray is needed so that we can
re-initialize shared memory after a crash. See commit c3cc2ab87d.

> I propose the attached refactorings to make this less confusing. See commit
> messages for details.

Thanks for doing this, Heikki. I agree that we ought to make this stuff
cleaner. I've asked Sami Imseih, who worked on LWLocks with me last year,
to look at this patch set, too.

> Subject: [PATCH v1 1/5] Rename MAX_NAMED_TRANCHES to MAX_USER_DEFINED_TRANCHES

Seems fine to me.

0002:

> +	foreach(lc, NamedLWLockTrancheRequests)

nitpick: These foreach loops seem like good opportunities to use
foreach_ptr.

The comment atop NumLWLocksForNamedTranches might benefit from mentioning
RequestNamedLWLockTranche() and the fact that it only works in the
postmaster. Perhaps an assertion is warranted, too.

+	SpinLockAcquire(ShmemLock);
+	LocalNumUserDefinedTranches = LWLockTranches->num_user_defined;
+	SpinLockRelease(ShmemLock);

Not critical, but it might be worth making num_user_defined an atomic.

Overall, 0002 looks reasonable to me upon a first read-through.

> Subject: [PATCH v1 3/5] Use a separate spinlock to protect LWLockTranches

Seems fine to me.

0004:

> +++ b/src/backend/storage/ipc/shmem.c
> @@ -379,7 +379,8 @@ ShmemInitStruct(const char *name, Size size, bool *foundPtr)
>
>  	Assert(ShmemIndex != NULL);
>
> -	LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
> +	if (IsUnderPostmaster)
> +		LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);

Am I understanding that we assume ShmemInitStruct() is only called by the
postmaster when there are no other backends yet?

0005:

> -	if (IsUnderPostmaster)
> -		LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);
> +	LWLockAcquire(ShmemIndexLock, LW_EXCLUSIVE);

Oh, this reverts many of these changes from 0004. Maybe the patches could
be reordered to avoid this?

--
nathan
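
[Editorial sketch, not part of the thread.] The review above concerns loops over the list of tranche requests. As a rough standalone illustration, and assuming the purpose of such a loop is to total the locks requested across all entries, the toy below walks a singly linked list of requests and sums their lock counts. ToyRequest and toy_num_locks_for_requests() are hypothetical names; the real code uses PostgreSQL's List type and its foreach macros, which are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy request entry: a name plus the number of locks asked for. */
typedef struct ToyRequest
{
    const char        *tranche_name;
    int                num_lwlocks;
    struct ToyRequest *next;
} ToyRequest;

/*
 * Sum the lock counts over all requests.  Iterating with a typed
 * pointer is loosely what a typed foreach macro buys you: no casts
 * from a generic list cell inside the loop body.
 */
static int
toy_num_locks_for_requests(const ToyRequest *head)
{
    int total = 0;

    for (const ToyRequest *req = head; req != NULL; req = req->next)
        total += req->num_lwlocks;
    return total;
}
```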
