On Wed, Sep 3, 2025 at 03:34, Mikhail Kot <mikhail....@databricks.com> wrote:
> Hi,
>
> I've encountered the following segmentation fault lately. It happens when
> Postgres is experiencing high memory pressure. There are multiple OOM errors
> in the log as well.
>
> Core was generated by `postgres: neondb_owner neondb ::1(46658) BIND '.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  pg_atomic_read_u32_impl (ptr=0x8) at ../../../../src/include/port/atomics/generic.h:48
> #1  pg_atomic_read_u32 (ptr=0x8) at ../../../../src/include/port/atomics.h:239
> #2  LWLockAttemptLock (lock=lock@entry=0x4, mode=mode@entry=LW_EXCLUSIVE) at lwlock.c:821
> #3  0x000056446bce129f in LWLockConditionalAcquire (lock=0x4, mode=mode@entry=LW_EXCLUSIVE) at lwlock.c:1386
> #4  0x000056446bd0bacf in pgstat_lock_entry (entry_ref=entry_ref@entry=0x56446d9f4340, nowait=nowait@entry=true) at pgstat_shmem.c:625
> #5  0x000056446bd0a3c9 in pgstat_relation_flush_cb (entry_ref=0x56446d9f4340, nowait=<optimized out>) at pgstat_relation.c:794
> #6  0x000056446bd069f5 in pgstat_flush_pending_entries (nowait=<optimized out>) at pgstat.c:1217
> #7  pgstat_report_stat (force=<optimized out>, force@entry=false) at pgstat.c:658
> #8  0x000056446bcf16c1 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4623
> #9  0x000056446bc716b3 in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4465
> #10 BackendStartup (port=<optimized out>) at postmaster.c:4193
> #11 ServerLoop () at postmaster.c:1782
> #12 0x000056446bc726ea in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x56446cd803b0) at postmaster.c:1466
> #13 0x000056446b9d5a00 in main (argc=3, argv=0x56446cd803b0) at main.c:238
>
> The error originates from pgstat_shmem.c file where shhashent is left in
> half-initialized state if pgstat_init_entry(), calling dsa_allocate0(), errors
> out with OOM. Then shhashent causes a segmentation fault on access. I propose a
> patch which solves this issue.
> The patch is for main branch, but the code is nearly identical in Postgres
> 13-17 so I suggest backporting it to other supported versions.
>
> The patch changes pgstat_init_entry()'s behaviour, returning NULL if memory
> allocation failed.

I'm wondering whether it would be better to raise elog(ERROR) instead and
avoid the many checks for this NULL.

best regards,
Ranier Vilela
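
For illustration only, here is a toy sketch in plain C of the failure mode
described above (a table slot becomes visible before its separately allocated
payload exists) and of the "return NULL and roll back" approach. Every name in
it (ToyEntry, toy_alloc0, toy_init_entry, toy_get_entry, toy_simulate_oom) is
made up for this example; it is not the PostgreSQL code and not the proposed
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a shared hash entry; NOT the real pgstat structure. */
typedef struct ToyEntry
{
    bool  in_use;   /* slot is visible to later lookups */
    int  *body;     /* separately allocated payload, NULL until initialized */
} ToyEntry;

#define TOY_NSLOTS 4
static ToyEntry toy_table[TOY_NSLOTS];

/* Hypothetical switch that makes the allocator fail, to simulate OOM. */
static bool toy_simulate_oom = false;

/* Zeroing allocator that returns NULL on (simulated) out-of-memory. */
static void *
toy_alloc0(size_t size)
{
    if (toy_simulate_oom)
        return NULL;
    return calloc(1, size);
}

/* Allocate the payload; report failure instead of leaving junk behind. */
static bool
toy_init_entry(ToyEntry *entry)
{
    entry->body = toy_alloc0(sizeof(int));
    return entry->body != NULL;
}

/*
 * Find-or-create. If the payload cannot be allocated, the freshly claimed
 * slot is released again, so later lookups never see a half-made entry.
 */
static ToyEntry *
toy_get_entry(size_t slot)
{
    ToyEntry *entry;

    assert(slot < TOY_NSLOTS);
    entry = &toy_table[slot];

    if (entry->in_use)
        return entry;               /* already fully initialized */

    entry->in_use = true;           /* slot claimed, payload not yet set */
    if (!toy_init_entry(entry))
    {
        entry->in_use = false;      /* roll back the half-made entry */
        return NULL;
    }
    return entry;
}
```

In this toy model, every caller of toy_get_entry() has to check for NULL,
which is the cost I mention above. An error-raising variant would remove those
checks, but the rollback step would presumably still have to run before the
error propagates, otherwise the half-made slot stays visible.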