On Thu, Mar 26, 2026 at 6:20 PM Robert Haas <[email protected]> wrote:
>[..v23]
0003: please be the judge here, as I'm not sure, but isn't there a rather
severe concurrency hit in pg_get_collected_shared_advice()? If I run
pgbench -M extended -c 12 -j 12 -P 1 -S:
progress: 59.0 s, 191008.4 tps, lat 0.063 ms stddev 0.200, 0 failed
progress: 60.0 s, 197571.2 tps, lat 0.061 ms stddev 0.026, 0 failed
progress: 61.0 s, 189825.5 tps, lat 0.063 ms stddev 0.208, 0 failed
progress: 62.0 s, 197082.4 tps, lat 0.061 ms stddev 0.027, 0 failed
progress: 63.0 s, 69345.9 tps, lat 0.173 ms stddev 1.651, 0 failed
progress: 64.0 s, 47243.6 tps, lat 0.251 ms stddev 2.128, 0 failed
progress: 65.0 s, 48211.6 tps, lat 0.247 ms stddev 2.156, 0 failed
the visible collapse from ~190k to ~48k tps is due to a constant flood of
artificial calls to: select count(*) from pg_get_collected_shared_advice();
The code holds LW_SHARED there across potentially lots of
tuplestore_putvalues() calls. Meanwhile, any other backend doing
pgca_planner_shutdown()->pg_collect_advice_save()->store_shared_advice()
tries to grab the lock in LW_EXCLUSIVE mode, so everything might be blocked
across the whole cluster? (I mean for the duration of the tuplestore
filling, and tuplestore even talks about "tape"/"disk", so to me it looks
like prolonged I/O on temp files could stall CPU-only planning work?)
Maybe it would be possible to copy those reads under LW_SHARED into
backend-local (private) memory and fill the tuplestore only after releasing
the lock, to avoid such a hazard? (The obvious problem is how much memory
we can afford and how big the shared area can grow.) Or maybe after some
time simply release the lock, sleep, and re-take it?
0004: question: why is the second dshash_seq_init() call in
pg_get_advice_stashes(), near the "Emit results" comment, done with
exclusive=true, when it apparently only reads?
-J.