On Thu, Sep 25, 2025 at 3:17 AM Bertrand Drouvot
<bertranddrouvot...@gmail.com> wrote:
>
> Hi,
>
> On Wed, Sep 24, 2025 at 10:11:20AM -0700, Masahiko Sawada wrote:
> > On Tue, Sep 23, 2025 at 11:31 PM Bertrand Drouvot
> > <bertranddrouvot...@gmail.com> wrote:
> >
> > Thank you for updating the patch! Here are some comments:
> >
> > ---
> > +   bool        memory_limit_reached = (rb->size >= logical_decoding_work_mem * (Size) 1024);
> > +
> > +   if (memory_limit_reached)
> > +       rb->memExceededCount += 1;
> >
> > Do we want to use 'exceeded' for the variable too for better consistency?
>
> I thought about it, but since we use ">=" I think that "reached" is more
> accurate. So I went for "reached" for this one and "exceeded" for the
> user-facing ones. That said, I don't have a strong opinion about it, and I'd
> be OK with using "exceeded" if you feel strongly about it.

Agreed with the current style. Thank you for the explanation.

>
> > ---
> > One thing I want to clarify is that even if the memory usage exceeds
> > logical_decoding_work_mem, it doesn't necessarily mean we serialize or
> > stream transactions, because of
> > ReorderBufferCheckAndTruncateAbortedTXN().
>
> Right.
>
> > For example, in a situation
> > where many large already-aborted transactions are truncated by
> > ReorderBufferCheckAndTruncateAbortedTXN(), users would see a high
> > number in the mem_exceeded_count column but it might not actually
> > require any adjustment of logical_decoding_work_mem.
>
> Yes, but in that case mem_exceeded_count would be high compared to
> spill_txns and stream_txns, no?

Right. I think only mem_exceeded_count would have a high number while
spill_txns and stream_txns would stay low in this case (as you showed in
your first example below).
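
To spell that heuristic out, something like the query below could show the
gap directly (mem_exceeded_count being the column this patch proposes; the
alias names are made up):

    SELECT slot_name,
           mem_exceeded_count,
           spill_txns + stream_txns AS evicted_txns,
           mem_exceeded_count - (spill_txns + stream_txns) AS hits_without_eviction
    FROM pg_stat_replication_slots;

A large hits_without_eviction would suggest the limit was mostly hit by
transactions that were later truncated as aborted, rather than spilled or
streamed.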

>
> > One idea is to
> > increment that counter only when exceeding the memory limit actually
> > causes us to serialize or stream transactions. On the other hand, it
> > might also make sense, and be more straightforward, to show a pure
> > statistic of how often memory usage exceeded
> > logical_decoding_work_mem. What do you think?
>
> The new counter, as proposed, helps to see whether the workload hits
> logical_decoding_work_mem frequently or not. I think that's valuable
> information to have on its own.
>
> Now to check if logical_decoding_work_mem needs adjustment, one could compare
> mem_exceeded_count with the existing spill_txns and stream_txns.
>
> For example, if I abort 20 transactions that exceeded
> logical_decoding_work_mem, I'd get:
>
> postgres=# select spill_txns,stream_txns,mem_exceeded_count from pg_stat_replication_slots;
>  spill_txns | stream_txns | mem_exceeded_count
> ------------+-------------+--------------------
>           0 |           0 |                 20
> (1 row)
>
> That way I could figure out that logical_decoding_work_mem has been reached
> only by aborted transactions.
>
> OTOH, if one sees spill_txns + stream_txns close to mem_exceeded_count, like:
>
> postgres=# select spill_txns,stream_txns,mem_exceeded_count from pg_stat_replication_slots;
>  spill_txns | stream_txns | mem_exceeded_count
> ------------+-------------+--------------------
>          38 |          20 |                 58
> (1 row)
>
> That probably means that logical_decoding_work_mem would need to be increased.
>
> What do you think?

Right. But one might argue that if we increment mem_exceeded_count only
when serializing or streaming is actually performed, mem_exceeded_count
would be 0 in the first example, and users could therefore simply check
mem_exceeded_count without any computation.
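
For illustration, that alternative could look roughly like the sketch below
(untested, loosely following the eviction loop in
ReorderBufferCheckMemoryLimit() with the streaming path omitted; only the
placement of the increment matters here):

    while (rb->size >= logical_decoding_work_mem * (Size) 1024)
    {
        /* Pick the largest transaction as the eviction candidate. */
        ReorderBufferTXN *txn = ReorderBufferLargestTXN(rb);

        /*
         * If the candidate is already aborted it is merely truncated, so
         * don't count that as a limit hit from the user's point of view.
         */
        if (ReorderBufferCheckAndTruncateAbortedTXN(rb, txn))
            continue;

        /* We really have to evict: count the limit hit, then spill. */
        rb->memExceededCount += 1;

        ReorderBufferSerializeTXN(rb, txn);
    }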

>
> BTW, while doing some tests for the above examples, I noticed that the patch
> was missing a check on memExceededCount in UpdateDecodingStats() (that
> produced mem_exceeded_count being 0 for one of the new tests in
> test_decoding): fixed in v5 attached.

Thank you for updating the patch!
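
For the archives, I assume the v5 fix amounts to including memExceededCount
in the "nothing to do" early return of UpdateDecodingStats(), along these
lines (a sketch based on the existing check):

    /* Nothing to do if we don't have any new stats to report. */
    if (rb->spillTxns == 0 && rb->spillCount == 0 && rb->spillBytes == 0 &&
        rb->streamTxns == 0 && rb->streamCount == 0 && rb->streamBytes == 0 &&
        rb->totalTxns == 0 && rb->totalBytes == 0 &&
        rb->memExceededCount == 0)
        return;

Without that, the stats update would be skipped when only memExceededCount
had changed, which would explain mem_exceeded_count staying at 0 in that
test.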

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

