Hi,

Patch v7 modifies CompactCheckpointerRequestQueue() to process requests incrementally in batches of CKPT_REQ_BATCH_SIZE (10,000), similar to the approach used in AbsorbSyncRequests(). This limits memory usage from O(num_requests) to O(batch_size) for both the hash table and the skip array.
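To make the shape of the change easier to see, here is a rough standalone sketch of the batching idea. It is illustrative only and not the code from the attached patch: Request, request_equal(), compact_batch(), compact_queue() and the quadratic within-batch scan are simplified stand-ins for the real CheckpointerRequest and hash-table machinery.

/*
 * Illustrative sketch only (not the patch): deduplicate a request queue
 * in fixed-size batches so that the auxiliary memory (dedup scan plus
 * skip flags) is O(BATCH_SIZE) rather than O(nrequests).
 */
#include <stdbool.h>
#include <string.h>

#define BATCH_SIZE 10000		/* mirrors CKPT_REQ_BATCH_SIZE in the patch */

typedef struct Request
{
	int			ftag;			/* placeholder for the real file tag */
	int			type;
} Request;

static bool
request_equal(const Request *a, const Request *b)
{
	return a->ftag == b->ftag && a->type == b->type;
}

/*
 * Mark later duplicates within one batch as skipped.  The quadratic scan
 * is for brevity; the real code would use a hash table sized for at most
 * BATCH_SIZE entries.
 */
static void
compact_batch(Request *reqs, bool *skip, int n)
{
	memset(skip, 0, n * sizeof(bool));
	for (int i = 0; i < n; i++)
	{
		for (int j = 0; j < i; j++)
		{
			if (!skip[j] && request_equal(&reqs[i], &reqs[j]))
			{
				skip[i] = true;
				break;
			}
		}
	}
}

/*
 * Walk the whole queue batch by batch, sliding surviving entries toward
 * the front.  Only BATCH_SIZE worth of scratch space is ever needed,
 * no matter how large the queue has grown.
 */
static int
compact_queue(Request *queue, int nrequests)
{
	bool		skip[BATCH_SIZE];
	int			write = 0;

	for (int start = 0; start < nrequests; start += BATCH_SIZE)
	{
		int			n = (nrequests - start < BATCH_SIZE) ?
			nrequests - start : BATCH_SIZE;

		compact_batch(queue + start, skip, n);
		for (int i = 0; i < n; i++)
		{
			if (!skip[i])
				queue[write++] = queue[start + i];
		}
	}
	return write;				/* new queue length */
}

int
main(void)
{
	Request		queue[] = {{1, 0}, {2, 0}, {1, 0}, {3, 0}, {2, 0}};
	int			n = compact_queue(queue, 5);

	/* n is 3: the repeated (1,0) and (2,0) entries fell in one batch */
	return (n == 3) ? 0 : 1;
}

As in the patch, only duplicates within a single batch are detected; duplicates that span batch boundaries survive, which is the trade-off discussed below.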
Benefits:
- Hash table memory is bounded by the batch size regardless of the total queue size
- The skip array allocation is limited to the batch size instead of max_requests
- Prevents potential OOM conditions with very large request queues

Trade-offs

Cross-batch duplicate detection: the incremental approach won't detect duplicates spanning batch boundaries. This limitation seems acceptable since:
- The main issue to be addressed here is preventing memory allocation failures
- Remaining duplicates are still handled by RememberSyncRequest() in the sync subsystem
- The purpose of this function is to make some room for new requests, not to remove all duplicates

Lock holding duration

Andres pointed out [1] that compacting a very large queue takes considerable time, and holding the exclusive lock for an extended period makes it much more likely that backends will have to perform syncs themselves - which is exactly what CompactCheckpointerRequestQueue() is trying to avoid in the first place. However, releasing the lock between batches would introduce race conditions that would make the design much more complex. Given that the primary goal of this patch is to avoid large memory allocations, I keep the lock held for the whole function for simplicity for now.

[1] https://postgrespro.com/list/id/bjno37ickfafixkqmd2lcyopsajnuig5mm4rg6tn2ackpqyiba@w3sjfo3usuos

Best regards,
Xuneng
v7-0001-Process-sync-requests-incrementally-in-AbsorbSync.patch