On Thu, Sep 25, 2025 at 3:17 AM Bertrand Drouvot
<bertranddrouvot...@gmail.com> wrote:
>
> Hi,
>
> On Wed, Sep 24, 2025 at 10:11:20AM -0700, Masahiko Sawada wrote:
> > On Tue, Sep 23, 2025 at 11:31 PM Bertrand Drouvot
> > <bertranddrouvot...@gmail.com> wrote:
> >
> > Thank you for updating the patch! Here are some comments:
> >
> > ---
> > +       bool    memory_limit_reached = (rb->size >=
> > +                                       logical_decoding_work_mem * (Size) 1024);
> > +
> > +       if (memory_limit_reached)
> > +               rb->memExceededCount += 1;
> >
> > Do we want to use 'exceeded' for the variable too for better consistency?
>
> I thought about it, but since we use ">=" I think that "reached" is more
> accurate. So I went for "reached" for this one and "exceeded" for the "user
> facing" ones. That said, I don't have a strong opinion about it, and I'd be
> ok to use "exceeded" if you feel strongly about it.
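
(For context, the hunk above sits at the top of
ReorderBufferCheckMemoryLimit(); the sketch below is just how I read v5,
and the surrounding lines are my assumption rather than the actual patch:)

static void
ReorderBufferCheckMemoryLimit(ReorderBuffer *rb)
{
    bool    memory_limit_reached = (rb->size >=
                                    logical_decoding_work_mem * (Size) 1024);

    /* Count every time the limit is reached, before acting on it. */
    if (memory_limit_reached)
        rb->memExceededCount += 1;

    /* ... existing eviction logic (truncate/serialize/stream) ... */
}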
Agreed with the current style. Thank you for the explanation.

>
> > ---
> > One thing I want to clarify is that even if the memory usage exceeds
> > logical_decoding_work_mem, it doesn't necessarily mean we serialize
> > or stream transactions, because of
> > ReorderBufferCheckAndTruncateAbortedTXN().
>
> Right.
>
> > For example, in a situation where many large already-aborted
> > transactions are truncated by ReorderBufferCheckAndTruncateAbortedTXN(),
> > users would see a high number in the mem_exceeded_count column, but it
> > might not actually require any adjustment of logical_decoding_work_mem.
>
> Yes, but in that case mem_exceeded_count would be high compared to
> spill_txns and stream_txns, no?

Right. I think only mem_exceeded_count has a high number while spill_txns
and stream_txns have lower numbers in this case (as you showed in your
first example below).

> > One idea is to increment that counter only if exceeding the memory
> > limit causes us to serialize or stream any transactions. On the other
> > hand, it might also make sense, and be straightforward, to show a pure
> > statistic that the memory usage exceeded logical_decoding_work_mem.
> > What do you think?
>
> The new counter, as it is proposed, helps to see whether the workload hits
> logical_decoding_work_mem frequently or not. I think that's valuable
> information to have on its own.
>
> Now, to check whether logical_decoding_work_mem needs adjustment, one could
> compare mem_exceeded_count with the existing spill_txns and stream_txns.
>
> For example, if I abort 20 transactions that exceeded
> logical_decoding_work_mem, I'd get:
>
> postgres=# select spill_txns,stream_txns,mem_exceeded_count from
> pg_stat_replication_slots ;
>  spill_txns | stream_txns | mem_exceeded_count
> ------------+-------------+--------------------
>           0 |           0 |                 20
> (1 row)
>
> That way, I could figure out that logical_decoding_work_mem has been
> reached only for aborted transactions.
>
> OTOH, if one sees spill_txns + stream_txns close to mem_exceeded_count,
> like:
>
> postgres=# select spill_txns,stream_txns,mem_exceeded_count from
> pg_stat_replication_slots ;
>  spill_txns | stream_txns | mem_exceeded_count
> ------------+-------------+--------------------
>          38 |          20 |                 58
> (1 row)
>
> that probably means that logical_decoding_work_mem would need to be
> increased.
>
> What do you think?

Right. But one might argue that if we increment mem_exceeded_count only when
serializing or streaming is actually performed, mem_exceeded_count would be 0
in the first example, and users would therefore be able to simply check
mem_exceeded_count without any computation.

> BTW, while doing some tests for the above examples, I noticed that the
> patch was missing a check on memExceededCount in UpdateDecodingStats()
> (which produced mem_exceeded_count being 0 for one of the new tests in
> test_decoding): fixed in v5 attached.

Thank you for updating the patch!

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
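
PS: In case it helps the discussion, the alternative I had in mind would look
roughly like the sketch below. It is not a patch: it is simplified (the real
code prefers the largest streamable top-level transaction when streaming is
possible), and it assumes ReorderBufferCheckAndTruncateAbortedTXN() reports
whether the transaction was aborted and truncated:

while (rb->size >= logical_decoding_work_mem * (Size) 1024)
{
    ReorderBufferTXN *txn = ReorderBufferLargestTXN(rb);

    /* Aborted transactions are just truncated; don't count those. */
    if (ReorderBufferCheckAndTruncateAbortedTXN(rb, txn))
        continue;

    /* The limit forced a real eviction, so count it here instead. */
    rb->memExceededCount += 1;

    if (ReorderBufferCanStartStreaming(rb))
        ReorderBufferStreamTXN(rb, txn);
    else
        ReorderBufferSerializeTXN(rb, txn);
}

That way, mem_exceeded_count would stay at 0 in your first example.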