Hi,

On 2/15/26 09:53, Reshmithaa wrote:
> Hi,
> 
> We have developed a PostgreSQL extension and are currently encountering
> intermittent crashes related to shared memory on PostgreSQL 17.7. The
> issue does not occur everytime, and we have not been able to reliably
> reproduce it in our local environment. We have attached the relevant pg
> logs and backtraces below for reference.
> 
> We also noticed a recent commit addressing a DSM-related issue:
> 
> https://git.postgresql.org/gitweb/?
> p=postgresql.git;a=commitdiff;h=1d0fc2499 <https://git.postgresql.org/
> gitweb/?p=postgresql.git;a=commitdiff;h=1d0fc2499>
> 
> Could you please confirm whether this change can resolve the type of
> crash we are encountering?
> 

Maybe, but it's hard to say based on the logs you provided.

In the commit message you linked, Robert speculated this should not be
happening in core Postgres, but that maybe extensions could trip over
this. What extensions are you using? Especially third-party
extensions?

It's interesting both traces end with a segfault, but in both cases
there are "strange" errors immediately before that. In particular the
first trace shows

[568804]:ERROR:  53200: out of memory
[568804]:DETAIL:  Failed while allocating entry 2/17266/55068384.
[568804]:LOCATION:  pgstat_get_entry_ref, pgstat_shmem.c:510
[568804]:STATEMENT:   ANALYZE tab2_temp

for the process that then crashes with a segfault. The second trace
unfortunately does not show what happened to PID 2037344 before it
crashes, which would be very interesting to know. Was it an OOM too?
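
If it helps with digging that out of the full server log, here is a
rough sketch (not something I ran against your logs) that pulls out
everything a given PID wrote. It assumes every entry starts with a
"[PID]:" prefix like the excerpts above, and that lines without such a
prefix are continuations of the previous entry. The function name
lines_for_pid is made up for this example.

```python
import re

# Assumption: each log entry starts with "[PID]:", as in the excerpts
# quoted in this thread. Adjust the pattern for a different log_line_prefix.
PREFIX = re.compile(r"^\[(\d+)\]:")


def lines_for_pid(log_lines, pid):
    """Return the log lines written by `pid`, in their original order.

    Lines without a "[PID]:" prefix are treated as continuations of the
    most recent prefixed line (for example, wrapped error messages).
    """
    selected = []
    current = None  # PID that owns the most recent prefixed line
    for line in log_lines:
        match = PREFIX.match(line)
        if match:
            current = int(match.group(1))
        if current == pid:
            selected.append(line.rstrip("\n"))
    return selected


if __name__ == "__main__":
    # Sample lines copied from the traces in this thread.
    sample = [
        "[568804]:ERROR:  53200: out of memory\n",
        "[2136038]:ERROR:  XX000: dsa_area could not attach to a segment that has\n",
        "been freed\n",
        "[568804]:STATEMENT:   ANALYZE tab2_temp\n",
    ]
    for entry in lines_for_pid(sample, 2136038):
        print(entry)
```

Running the same function over the whole log file with pid=2037344
should show whether that backend hit an OOM (or anything else) before
the segfault.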

But there are a couple other suspicious errors from other PIDs, like

[2136038]:ERROR:  XX000: dsa_area could not attach to a segment that has
been freed
[2136038]:LOCATION:  get_segment_by_index, dsa.c:1781

I don't think this should be happening in core code, at least I don't
recall seeing anything like that recently.


I wonder if the cleanup after OOM could lead to the crash because the
error cleanup destroys the short-lived context, in a way Robert did not
envision in the commit message. But I haven't tried it, so this is just
speculation.

FWIW even if the commit (and upgrading to 17.8) fixes the crash, I don't
think that'll fix the other errors shown in the traces. You should
probably look into that.


regards

-- 
Tomas Vondra

