Hello Hackers! I will be working on reducing pg_stat_statements LWLock contention [1] for GSoC 2026. Over the past several weeks I have been engaging with my mentors, Kirk, Nik, Andreas and Andrei, and wanted to introduce myself, share what I have learned so far, and get feedback from the broader community.
*About me* I am a sophomore CS student at UMass Amherst, interning at Deep Infra where I profile and optimize inference engines (vLLM, SGLang, TensorRT-LLM) under high concurrency. My systems background is mostly C++ through competitive programming; this project is my path into production C and PostgreSQL internals. *The problem* pg_stat_statements serializes every backend through a single LWLock (pgss->lock). Structural operations (new entry insertion, deallocation, reset) require an exclusive lock that blocks the entire cluster. The deallocation path is the main offender: entry_dealloc() holds the exclusive lock through an O(n log n) sort over all entries. Borodin's 2022 incident report [2] is the worst documented case; it froze an entire production database. *What I have done* I built PostgreSQL 18.3 and 19devel from source (--enable-debug --enable-cassert --enable-injection-points) and benchmarked the contention. At max=100 with 1000 distinct query forms, 90-100% of active backends block on LWLock|pg_stat_statements during deallocation churn, with entry_dealloc() firing ~30,000 times in 60 seconds. PG19 holds the exclusive lock for less time per acquisition (~40% less overhead than PG18) but the bottleneck pattern is identical. I also reproduced the same stall from periodic pg_stat_statements_reset() calls. Through back-and-forth with the mentors, my initial conditional-skip design evolved into a pending-entry queue, specifically to avoid the silent data loss that got Borodin's 2022 patch rejected. *Proposed approach* Two core deliverables: (1) a pending-entry queue using LWLockConditionalAcquire(), with queued/dropped counters in pg_stat_statements_info for observability; and (2) restructuring entry_dealloc() to sort outside the exclusive lock, reducing exclusive hold time from O(n log n) to O(n). Both ship independently. Two stretch goals are scoped as post-core exploration: lock separation between structural changes and counter updates, and an optimized reset path. The full proposal is attached. I would especially welcome feedback on queue-full behavior under sustained contention and on concurrent dealloc edge cases during the snapshot-to-eviction window. I know work like this rarely lands in a single summer. I plan to stay engaged past August and leave any unfinished patches in shape for others to carry forward. ----- [1] wiki.postgresql.org/wiki/GSoC_2026#Monitoring_Tools_Performance:_pg_stat_statements_and_LWLock_Contention [2] postgresql.org/message-id/[email protected] Best regards, Quan Hoang Truong
LWLock_Contention_Proposal.pdf
Description: Adobe PDF document
