Hi,

On 2026-03-17 16:56:48 -0400, Tom Lane wrote:
> "Peter 'PMc' Much" <[email protected]> writes:
> > On Tue, Mar 17, 2026 at 10:12:07AM -0400, Tom Lane wrote:
> > ! Why it was okay in older FreeBSD and not so much in v14, who knows?
>
> > Maybe it wasn't. Here it appeared out of thin air in February, while
> > the system was upgraded from 13.5 to 14.3 in July'25, and did run
> > without problems for these eight months.
> > So this is not directly or solely related to FBSD R.14, and while it
> > happens more likely during massive memory use, but this also is not
> > stingent. Neither did I find any other solid determining condition.
>
> Yeah, it seems likely that there is some additional triggering
> condition that we don't understand; otherwise there would be more
> people complaining than just you.

One issue we've seen in the past (on some other BSD, I think NetBSD?) is a
signal handler calling a C function from a shared library that had never been
called before the handler ran, so the lazy dynamic symbol resolution happened
inside the handler and allocated memory. That then contributed to deadlocks
and/or corruption of allocator metadata.

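You could check if that's a factor by exporting LD_BIND_NOW (set to any
non-empty value) in postmaster's environment, which makes the dynamic linker
resolve all function symbols at startup instead of lazily on first call.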


The way the signal handling worked before 16 should not really lead to corrupt
allocator data structures, as the signal handler is only allowed to run while
normal execution is suspended (or while only async-signal-safe code runs,
e.g. after waking up, until reaching the sigmask calls that block the signal
again).  ISTM that either some other signal handler that allocates memory was
interrupted by SIGUSR1, or postmaster allocated memory while the signal was
unmasked.  The dynamic linker doing function resolution could be an
explanation.
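
To make the "handler only runs while normal execution is suspended" pattern
concrete, here is a minimal sketch. It is not the actual postmaster code; the
names setup_blocked_sigusr1() and wait_for_sigusr1() are made up for
illustration, and it assumes a single-threaded process:

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Set by the handler, consumed by the main loop. */
static volatile sig_atomic_t got_sigusr1 = 0;

/* Async-signal-safe: does nothing but store to a sig_atomic_t flag. */
static void
handle_sigusr1(int signo)
{
    (void) signo;
    got_sigusr1 = 1;
}

/*
 * Hypothetical helper: install the handler and block SIGUSR1 for normal
 * execution.  *wait_mask receives the previous mask with SIGUSR1 removed,
 * for use with sigsuspend().  Returns 0 on success, -1 on failure.
 */
static int
setup_blocked_sigusr1(sigset_t *wait_mask)
{
    struct sigaction sa;
    sigset_t    block;

    sa.sa_handler = handle_sigusr1;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    if (sigprocmask(SIG_BLOCK, &block, wait_mask) != 0)
        return -1;
    sigdelset(wait_mask, SIGUSR1);
    return 0;
}

/*
 * Hypothetical helper: sleep until SIGUSR1 has been delivered.  The handler
 * can only run inside sigsuspend(), i.e. while this code is not in the
 * middle of malloc() or anything else that is not reentrant.
 */
static int
wait_for_sigusr1(const sigset_t *wait_mask)
{
    while (!got_sigusr1)
        sigsuspend(wait_mask);  /* atomically unblocks the signal and sleeps */
    got_sigusr1 = 0;
    /* SIGUSR1 is blocked again here, so allocating memory is fine. */
    return 1;
}
```

With that structure, the handler cannot interrupt an allocation in the main
code path.  It can still go wrong if the handler itself calls something that
is not async-signal-safe, and a first call through a lazily bound symbol
counts as that when the dynamic linker allocates during resolution.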

Greetings,

Andres Freund

