Hello Thomas,

31.08.2023 14:15, Thomas Munro wrote:
> We have a signal that is pending and not blocked, so I don't immediately
> know why poll() hasn't returned control.

When I worked at the Postgres Pro company, we observed a similar lockup
under rather specific conditions (we used an Elbrus CPU and the specific
Elbrus compiler (lcc) based on edg).
I managed to reproduce that lockup and Anton Voloshin investigated it.
The issue was caused by a compiler optimization in WaitEventSetWait():

    waiting = true;
...
    while (returned_events == 0)
    {
...
        if (set->latch && set->latch->is_set)
        {
...
            break;
        }

In that case, the compiler decided that it may place the read
"set->latch->is_set" before the write "waiting = true".
(Placing "pg_compiler_barrier();" just after "waiting = true;" fixed the
issue for us.)
I can't provide more details for now, but maybe you could look at the binary
code generated on the target platform to confirm or reject my guess.

Best regards,
Alexander
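
P.S. To make the idea concrete, here is a minimal standalone sketch of the pattern, not the actual latch.c code. The Latch and WaitEventSet structs are simplified stand-ins, compiler_barrier() is a hypothetical substitute for pg_compiler_barrier() written with the GCC/Clang inline-asm syntax, and the helper name is made up for illustration. The barrier only stops the compiler from moving the read of is_set above the write to waiting; it is not a CPU memory barrier.

```c
#include <stdbool.h>
#include <stddef.h>
#include <signal.h>

/*
 * Hypothetical stand-in for pg_compiler_barrier(); assumes a GCC/Clang
 * compatible compiler.  It emits no instructions, but the "memory" clobber
 * forbids the compiler from moving memory accesses across it.
 */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Simplified stand-ins, not the real PostgreSQL structures. */
typedef struct Latch
{
    sig_atomic_t is_set;
} Latch;

typedef struct WaitEventSet
{
    Latch *latch;
} WaitEventSet;

/* Flag that a signal handler would inspect (assumed for this sketch). */
static volatile sig_atomic_t waiting = false;

/*
 * Announce that we are about to wait, then check the latch.
 * Without the barrier, a compiler is allowed to load is_set first and
 * store to "waiting" afterwards, which is the reordering described above.
 */
static bool
latch_is_set_after_announcing_wait(WaitEventSet *set)
{
    waiting = true;
    compiler_barrier();         /* keep the read below after the write above */

    return set->latch != NULL && set->latch->is_set;
}
```

Whether the reordering actually happens can only be confirmed by looking at the generated code on the target platform; the sketch just shows where the barrier sits relative to the write and the read.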