Hi Nathan,

On Thu, Sep 11, 2025 at 09:15:12PM +0000, Nathan Bossart wrote:
> Move named LWLock tranche requests to shared memory.
> 
> In EXEC_BACKEND builds, GetNamedLWLockTranche() can segfault when
> called outside of the postmaster process, as it might access
> NamedLWLockTrancheRequestArray, which won't be initialized.  Given
> the lack of reports, this is apparently unusual, presumably because
> it is usually called from a shmem_startup_hook like this:

Since this commit has been merged, batta has kept failing.  Here is
the first failure:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=batta&dt=2025-09-12%2002%3A05%3A01

I use this animal with a specific configuration:
shared_preload_libraries = 'pg_stat_statements'
compute_query_id = regress
regress_dump_restore
wal_consistency_checking
--enable-injection-points

The recovery tests 013_crash_restart.pl, 022_crash_temp_files.pl and
041_checkpoint_at_promote.pl stress some restart scenarios; not all of
them use injection points.  I could not get a backtrace from the host.

However, I have come up with the following change in 013 that's able
to reproduce what I think is the same crash:
--- a/src/test/recovery/t/013_crash_restart.pl
+++ b/src/test/recovery/t/013_crash_restart.pl
@@ -21,6 +21,8 @@ my $psql_timeout = IPC::Run::timer($PostgreSQL::Test::Utils::timeout_default);
 
 my $node = PostgreSQL::Test::Cluster->new('primary');
 $node->init(allows_streaming => 1);
+$node->append_conf('postgresql.conf',
+                   "shared_preload_libraries = 'pg_stat_statements'");
 $node->start();
 
And here is the backtrace:
#0  0x000055fcdf6bc97a in NumLWLocksForNamedTranches () at lwlock.c:385
385 numLocks += NamedLWLockTrancheRequestArray[i].num_lwlocks;
(gdb) bt
#0  0x000055fcdf6bc97a in NumLWLocksForNamedTranches () at lwlock.c:385 
#1  0x000055fcdf6bc9b3 in LWLockShmemSize () at lwlock.c:400 
#2  0x000055fcdf65bda5 in CalculateShmemSize (num_semaphores=0x7ffcaf7a78e4) at ipci.c:130
#3  0x000055fcdf65c0b1 in CreateSharedMemoryAndSemaphores () at ipci.c:210 
#4  0x000055fcdf42830c in PostmasterStateMachine () at postmaster.c:3223 
#5  0x000055fcdf42703f in process_pm_child_exit () at postmaster.c:2558 
#6  0x000055fcdf425729 in ServerLoop () at postmaster.c:1696 
#7  0x000055fcdf424be1 in PostmasterMain (argc=4, argv=0x55fd0a8faa10) at postmaster.c:1403
#8  0x000055fcdef80a19 in main (argc=4, argv=0x55fd0a8faa10) at main.c:231
(gdb) p i
$3 = 0
(gdb) p NamedLWLockTrancheRequestArray[0]
Cannot access memory at address 0x7f15ee4ccc08

Thanks,
--
Michael

