Hi,

Patch v7 modifies CompactCheckpointerRequestQueue() to process requests incrementally in batches of CKPT_REQ_BATCH_SIZE (10,000), similar to the approach used in AbsorbSyncRequests(). This limits memory usage from O(num_requests) to O(batch_size) for both the hash table and the skip array.
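To make the shape of the change easier to see, here is a rough standalone sketch of the batching idea. It is illustrative only and not the code from the attached patch: Request, request_equal(), compact_batch(), compact_queue() and the quadratic within-batch scan are simplified stand-ins for the real CheckpointerRequest and hash-table machinery.

/*
 * Illustrative sketch only (not the patch): deduplicate a request queue
 * in fixed-size batches so that the auxiliary memory (dedup scan plus
 * skip flags) is O(BATCH_SIZE) rather than O(nrequests).
 */
#include <stdbool.h>
#include <string.h>

#define BATCH_SIZE 10000		/* mirrors CKPT_REQ_BATCH_SIZE in the patch */

typedef struct Request
{
	int			ftag;			/* placeholder for the real file tag */
	int			type;
} Request;

static bool
request_equal(const Request *a, const Request *b)
{
	return a->ftag == b->ftag && a->type == b->type;
}

/*
 * Mark later duplicates within one batch as skipped.  The quadratic scan
 * is for brevity; the real code would use a hash table sized for at most
 * BATCH_SIZE entries.
 */
static void
compact_batch(Request *reqs, bool *skip, int n)
{
	memset(skip, 0, n * sizeof(bool));
	for (int i = 0; i < n; i++)
	{
		for (int j = 0; j < i; j++)
		{
			if (!skip[j] && request_equal(&reqs[i], &reqs[j]))
			{
				skip[i] = true;
				break;
			}
		}
	}
}

/*
 * Walk the whole queue batch by batch, sliding surviving entries toward
 * the front.  Only BATCH_SIZE worth of scratch space is ever needed,
 * no matter how large the queue has grown.
 */
static int
compact_queue(Request *queue, int nrequests)
{
	bool		skip[BATCH_SIZE];
	int			write = 0;

	for (int start = 0; start < nrequests; start += BATCH_SIZE)
	{
		int			n = (nrequests - start < BATCH_SIZE) ?
			nrequests - start : BATCH_SIZE;

		compact_batch(queue + start, skip, n);
		for (int i = 0; i < n; i++)
		{
			if (!skip[i])
				queue[write++] = queue[start + i];
		}
	}
	return write;				/* new queue length */
}

int
main(void)
{
	Request		queue[] = {{1, 0}, {2, 0}, {1, 0}, {3, 0}, {2, 0}};
	int			n = compact_queue(queue, 5);

	/* n is 3: the repeated (1,0) and (2,0) entries fell in one batch */
	return (n == 3) ? 0 : 1;
}

As in the patch, only duplicates within a single batch are detected; duplicates that span batch boundaries survive, which is the trade-off discussed below.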
Benefits:
- Hash table memory is bounded by the batch size regardless of the total queue size
- The skip array allocation is limited to the batch size instead of max_requests
- Prevents potential OOM conditions with very large request queues

Trade-offs

Cross-batch duplicate detection: the incremental approach won't detect duplicates spanning batch boundaries. This limitation seems acceptable since:
- The main issue to be addressed here is preventing memory allocation failures
- Remaining duplicates are still handled by RememberSyncRequest() in the sync subsystem
- The purpose of this function is to make some room for new requests, not to remove all duplicates

Lock holding duration

Andres pointed out [1] that compacting a very large queue takes considerable time, and holding the exclusive lock for an extended period makes it much more likely that backends will have to perform syncs themselves - which is exactly what CompactCheckpointerRequestQueue() is trying to avoid in the first place. However, releasing the lock between batches would introduce race conditions that would make the design much more complex. Given that the primary goal of this patch is to avoid large memory allocations, I keep the lock held for the whole function for simplicity for now.

[1] https://postgrespro.com/list/id/bjno37ickfafixkqmd2lcyopsajnuig5mm4rg6tn2ackpqyiba@w3sjfo3usuos

Best regards,
Xuneng
v7-0001-Process-sync-requests-incrementally-in-AbsorbSync.patch