Wake up for latches in CheckpointWriteDelay().

The checkpointer shouldn't ignore its latch. Other backends may be waiting for it to drain the request queue. Hopefully real systems don't have a full queue often, but the condition is reached easily when shared_buffers is small.

This involves defining a new wait event, which will appear in the pg_stat_activity view often due to spread checkpoints.

Back-patch only to 14. Even though the problem exists in earlier branches too, it's hard to hit there. In 14 we stopped using signal handlers for latches on Linux, *BSD and macOS, which were previously hiding this problem by interrupting the sleep (though not reliably, as the signal could arrive before the sleep begins; precisely the problem latches address).

Reported-by: Andres Freund <and...@anarazel.de>
Reviewed-by: Andres Freund <and...@anarazel.de>
Discussion: https://postgr.es/m/20220226213942.nb7uvb2pamyu26dj%40alap3.anarazel.de

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/5e6368b42ee6d4b59e085301ca7b0e50f37a897b

Modified Files
--------------
doc/src/sgml/monitoring.sgml            | 4 ++++
src/backend/postmaster/checkpointer.c   | 8 +++++++-
src/backend/utils/activity/wait_event.c | 3 +++
src/include/utils/wait_event.h          | 1 +
4 files changed, 15 insertions(+), 1 deletion(-)
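
For readers unfamiliar with why a latch wait is safer than a plain sleep that relies on signal interruption, here is a minimal standalone sketch of the idea. It is not PostgreSQL's latch implementation and does not use its API; MiniLatch and the mini_latch_* functions are hypothetical names invented for this illustration, built on the classic self-pipe trick with poll(). The point it shows is that a wakeup posted before the wait begins is remembered rather than lost.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical miniature latch (self-pipe trick); not PostgreSQL's Latch. */
typedef struct MiniLatch
{
    int read_fd;   /* waited on with poll() */
    int write_fd;  /* written to by whoever sets the latch */
} MiniLatch;

/* Create the pipe and make both ends non-blocking. Returns false on error. */
static bool
mini_latch_init(MiniLatch *latch)
{
    int fds[2];

    assert(latch != NULL);
    if (pipe(fds) != 0)
        return false;
    for (int i = 0; i < 2; i++)
    {
        int flags = fcntl(fds[i], F_GETFL);

        if (flags < 0 || fcntl(fds[i], F_SETFL, flags | O_NONBLOCK) < 0)
        {
            close(fds[0]);
            close(fds[1]);
            return false;
        }
    }
    latch->read_fd = fds[0];
    latch->write_fd = fds[1];
    return true;
}

/* Post a wakeup. The byte stays in the pipe until the latch is reset. */
static void
mini_latch_set(MiniLatch *latch)
{
    char byte = 0;

    if (write(latch->write_fd, &byte, 1) < 0)
    {
        /* EAGAIN means the pipe is full, so the latch is already set. */
    }
}

/*
 * Wait until the latch is set or timeout_ms elapses.
 * Returns true if the latch was set, false on timeout or error.
 * For brevity, an EINTR retry restarts the full timeout.
 */
static bool
mini_latch_wait(MiniLatch *latch, int timeout_ms)
{
    struct pollfd pfd = {.fd = latch->read_fd, .events = POLLIN};
    int rc;

    do
    {
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc < 0 && errno == EINTR);

    return rc > 0 && (pfd.revents & POLLIN) != 0;
}

/* Clear the latch by draining every pending byte. */
static void
mini_latch_reset(MiniLatch *latch)
{
    char buf[64];

    while (read(latch->read_fd, buf, sizeof(buf)) > 0)
        ;
}
```

With this sketch, calling mini_latch_set() and only afterwards mini_latch_wait() returns immediately with true, whereas a signal delivered just before a plain sleep would leave the sleeper waiting for the full timeout.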