Re: RFC: Detailed reorder buffer stats dumps

2021-06-29 Thread Craig Ringer
On Thu, 6 May 2021 at 19:40, Amit Kapila  wrote:

> On Thu, May 6, 2021 at 9:54 AM Craig Ringer  wrote:
> >

> At the least it'd be helpful to have pg_stat_replication (or a new
> related auxiliary view like pg_stat_logical_decoding) show
> >
> > - walsender total bytes sent this session
> > - number of txns processed this txn
> >
>
> You might be able to derive some of the above sorts of stats from the
> newly added pg_stat_replication_slots [1].
>
>
That's a huge improvement that I managed to totally miss the discussion of
and work on. Thanks. It'll be a significant help.
'

> > - number txns filtered out by output plugin this session
> > - oldest xid in reorder buffer
> > - reorder buffer number of txns
> > - reorder buffer total size (in-memory and total inc spilled)
> > - reorderbuffercommit current xid, last change lsn, total buffered size
> of current tx, total bytes of buffer processed so far within the current
> txn, and commit lsn if known, only when currently streaming a txn from
> reorderbuffercommit
>

These are less statistical in nature, and more about the current activity
of the walsender and logical decoding state. I'm not sure if it'd make much
sense to tack them on to pg_stat_replication_slots, especially as that'd
also mean they were quite delayed.

But it probably isn't worth the effort of exposing this info in a new view.

With the higher level info now available in pg_stat_replication_slots, I
think I might look into exposing these finer details via trace markers for
use with perf / systemtap  / etc instead.

A digression, but: It's a real shame that such tools don't give us a way to
read specific tagged regions of memory with the same ease they let us probe
function calls though. You generally need gdb to read the value of a
global, or a moderately funky systemtap script. There's no convenient
equivalent to SDT markers (TRACE_FOO) to tag variables. Wouldn't it be nice
if we could

 perf watch postgres:walsender_reorderbuffer_oldest_xid

or something like that?


Re: RFC: Detailed reorder buffer stats dumps

2021-05-06 Thread Amit Kapila
On Thu, May 6, 2021 at 9:54 AM Craig Ringer  wrote:
>
> On Thu, 6 May 2021 at 02:28, Andres Freund  wrote:
>>
>> Hi,
>>
>> On 2021-05-05 18:33:27 +0800, Craig Ringer wrote:
>> > I'm thinking of piggy-backing on the approach used in the "Get memory
>> > contexts of an arbitrary backend process" patch in order to provide access
>> > to detailed reorder buffer content statistics from walsenders on request.
>> >
>> > Right now the reorder buffer is mostly a black-box. I mostly rely on gdb or
>> > on dynamic probes (perf, systemtap) to see what it's doing. I intend a
>> > patch soon to add a couple of fields to struct WalSnd to report some very
>> > coarse reorder buffer stats - at least oldest buffered xid, number of
>> > buffered txns, total bytes of buffered txns in memory, total bytes of
>> > buffered txns spilled to disk.
>> >
>> > But sometimes what I really want is details on the txns that're in the
>> > reorder buffer, and that's not feasible to export via always-enabled
>> > reporting like struct WalSnd. So I'm thinking that the same approach used
>> > for the memory context stats patch might work well for asking the walsender
>> > for a detailed dump of reorder buffer contents. Something like a
>> > per-buffered-txn table of txn topxid, start-lsn, most recent change lsn,
>> > number of changes, number of subxids, number of invalidations, number of
>> > catalog changes, buffer size in memory, buffer size spilled to disk.
>> >
>> > Anyone drastically opposed to the idea?
>>
>> I am doubtful. The likelihood of ending with effectively unused code
>> seems very substantial here.
>
>
> I can't rule that out, but the motivation for this proposal isn't development 
> convenience. It's production support and operations. The reorder buffer is a 
> black box right now, and when you're trying to answer the questions "what is 
> the walsender doing," "is meaningful progress being made," and "what is 
> slowing down replication" it's ... not easy.
>
> I currently rely on some fairly hairy gdb scripts, which I'm not keen on 
> running on production systems if I can avoid it.
>
> I'm far from set on the approach of asking the walsender to dump a reorder 
> buffer state summary to a file. But I don't think the current state of 
> affairs is much fun for production use. Especially since we prevent the 
> pg_stat_replication sent_lsn from going backwards, so reorder buffering can 
> cause replication to appear to completely cease to progress for long periods 
> unless you identify the socket and monitor traffic on it, or you intrude on 
> the process with gdb.
>
> At the least it'd be helpful to have pg_stat_replication (or a new related 
> auxiliary view like pg_stat_logical_decoding) show
>
> - walsender total bytes sent this session
> - number of txns processed this txn
>

You might be able to derive some of the above sorts of stats from the
newly added pg_stat_replication_slots [1].

> - number txns filtered out by output plugin this session
> - oldest xid in reorder buffer
> - reorder buffer number of txns
> - reorder buffer total size (in-memory and total inc spilled)
> - reorderbuffercommit current xid, last change lsn, total buffered size of 
> current tx, total bytes of buffer processed so far within the current txn, 
> and commit lsn if known, only when currently streaming a txn from 
> reorderbuffercommit
>
> That way it'd be possible to tell if a logical walsender is currently 
> processing a commit and get a much better sense of its progress within the 
> commit.
>
> Perhaps output plugins could do some of this and expose their own custom 
> views. But then each plugin would have to add its own. Plus they don't get a 
> particularly good view into the reorder buffer state; they'd have a hard time 
> maintaining good running stats.
>
> Some basic monitoring exposed for logical decoding and reorder buffering 
> would help a lot. Does that sound more palatable?
>

Can't we think of enhancing existing views or introduce a new view to
provide such information?

[1] - 
https://www.postgresql.org/docs/devel/monitoring-stats.html#MONITORING-PG-STAT-REPLICATION-SLOTS-VIEW

-- 
With Regards,
Amit Kapila.




Re: RFC: Detailed reorder buffer stats dumps

2021-05-05 Thread Craig Ringer
On Thu, 6 May 2021 at 02:28, Andres Freund  wrote:

> Hi,
>
> On 2021-05-05 18:33:27 +0800, Craig Ringer wrote:
> > I'm thinking of piggy-backing on the approach used in the "Get memory
> > contexts of an arbitrary backend process" patch in order to provide
> access
> > to detailed reorder buffer content statistics from walsenders on request.
> >
> > Right now the reorder buffer is mostly a black-box. I mostly rely on gdb
> or
> > on dynamic probes (perf, systemtap) to see what it's doing. I intend a
> > patch soon to add a couple of fields to struct WalSnd to report some very
> > coarse reorder buffer stats - at least oldest buffered xid, number of
> > buffered txns, total bytes of buffered txns in memory, total bytes of
> > buffered txns spilled to disk.
> >
> > But sometimes what I really want is details on the txns that're in the
> > reorder buffer, and that's not feasible to export via always-enabled
> > reporting like struct WalSnd. So I'm thinking that the same approach used
> > for the memory context stats patch might work well for asking the
> walsender
> > for a detailed dump of reorder buffer contents. Something like a
> > per-buffered-txn table of txn topxid, start-lsn, most recent change lsn,
> > number of changes, number of subxids, number of invalidations, number of
> > catalog changes, buffer size in memory, buffer size spilled to disk.
> >
> > Anyone drastically opposed to the idea?
>
> I am doubtful. The likelihood of ending with effectively unused code
> seems very substantial here.
>

I can't rule that out, but the motivation for this proposal isn't
development convenience. It's production support and operations. The
reorder buffer is a black box right now, and when you're trying to answer
the questions "what is the walsender doing," "is meaningful progress being
made," and "what is slowing down replication" it's ... not easy.

I currently rely on some fairly hairy gdb scripts, which I'm not keen on
running on production systems if I can avoid it.

I'm far from set on the approach of asking the walsender to dump a reorder
buffer state summary to a file. But I don't think the current state of
affairs is much fun for production use. Especially since we prevent the
pg_stat_replication sent_lsn from going backwards, so reorder buffering can
cause replication to appear to completely cease to progress for long
periods unless you identify the socket and monitor traffic on it, or you
intrude on the process with gdb.

At the least it'd be helpful to have pg_stat_replication (or a new related
auxiliary view like pg_stat_logical_decoding) show

- walsender total bytes sent this session
- number of txns processed this txn
- number txns filtered out by output plugin this session
- oldest xid in reorder buffer
- reorder buffer number of txns
- reorder buffer total size (in-memory and total inc spilled)
- reorderbuffercommit current xid, last change lsn, total buffered size of
current tx, total bytes of buffer processed so far within the current txn,
and commit lsn if known, only when currently streaming a txn from
reorderbuffercommit

That way it'd be possible to tell if a logical walsender is currently
processing a commit and get a much better sense of its progress within the
commit.

Perhaps output plugins could do some of this and expose their own custom
views. But then each plugin would have to add its own. Plus they don't get
a particularly good view into the reorder buffer state; they'd have a hard
time maintaining good running stats.

Some basic monitoring exposed for logical decoding and reorder buffering
would help a lot. Does that sound more palatable?

If so, I'd probably still want to be able to hook a few places in logical
decoding to allow an extension to instrument it when greater insight into
the inner workings is required.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise


Re: RFC: Detailed reorder buffer stats dumps

2021-05-05 Thread Andres Freund
Hi,

On 2021-05-05 18:33:27 +0800, Craig Ringer wrote:
> I'm thinking of piggy-backing on the approach used in the "Get memory
> contexts of an arbitrary backend process" patch in order to provide access
> to detailed reorder buffer content statistics from walsenders on request.
> 
> Right now the reorder buffer is mostly a black-box. I mostly rely on gdb or
> on dynamic probes (perf, systemtap) to see what it's doing. I intend a
> patch soon to add a couple of fields to struct WalSnd to report some very
> coarse reorder buffer stats - at least oldest buffered xid, number of
> buffered txns, total bytes of buffered txns in memory, total bytes of
> buffered txns spilled to disk.
> 
> But sometimes what I really want is details on the txns that're in the
> reorder buffer, and that's not feasible to export via always-enabled
> reporting like struct WalSnd. So I'm thinking that the same approach used
> for the memory context stats patch might work well for asking the walsender
> for a detailed dump of reorder buffer contents. Something like a
> per-buffered-txn table of txn topxid, start-lsn, most recent change lsn,
> number of changes, number of subxids, number of invalidations, number of
> catalog changes, buffer size in memory, buffer size spilled to disk.
> 
> Anyone drastically opposed to the idea?

I am doubtful. The likelihood of ending with effectively unused code
seems very substantial here.

Greetings,

Andres Freund




Re: RFC: Detailed reorder buffer stats dumps

2021-05-05 Thread Ashutosh Bapat
On Wed, May 5, 2021 at 4:03 PM Craig Ringer  wrote:
>
> Hi all
>
> I'm thinking of piggy-backing on the approach used in the "Get memory 
> contexts of an arbitrary backend process" patch in order to provide access to 
> detailed reorder buffer content statistics from walsenders on request.
>
> Right now the reorder buffer is mostly a black-box. I mostly rely on gdb or 
> on dynamic probes (perf, systemtap) to see what it's doing. I intend a patch 
> soon to add a couple of fields to struct WalSnd to report some very coarse 
> reorder buffer stats - at least oldest buffered xid, number of buffered txns, 
> total bytes of buffered txns in memory, total bytes of buffered txns spilled 
> to disk.
>
> But sometimes what I really want is details on the txns that're in the 
> reorder buffer, and that's not feasible to export via always-enabled 
> reporting like struct WalSnd. So I'm thinking that the same approach used for 
> the memory context stats patch might work well for asking the walsender for a 
> detailed dump of reorder buffer contents. Something like a per-buffered-txn 
> table of txn topxid, start-lsn, most recent change lsn, number of changes, 
> number of subxids, number of invalidations, number of catalog changes, buffer 
> size in memory, buffer size spilled to disk.
>

+1.

Will be really useful to troubleshoot what's going on in the ReorderBuffer.

If we put that in WalSnd it will not be useful for the connections
which are using normal backends to get logical changes through
built-in functions.
-- 
Best Wishes,
Ashutosh Bapat




RFC: Detailed reorder buffer stats dumps

2021-05-05 Thread Craig Ringer
Hi all

I'm thinking of piggy-backing on the approach used in the "Get memory
contexts of an arbitrary backend process" patch in order to provide access
to detailed reorder buffer content statistics from walsenders on request.

Right now the reorder buffer is mostly a black-box. I mostly rely on gdb or
on dynamic probes (perf, systemtap) to see what it's doing. I intend a
patch soon to add a couple of fields to struct WalSnd to report some very
coarse reorder buffer stats - at least oldest buffered xid, number of
buffered txns, total bytes of buffered txns in memory, total bytes of
buffered txns spilled to disk.

But sometimes what I really want is details on the txns that're in the
reorder buffer, and that's not feasible to export via always-enabled
reporting like struct WalSnd. So I'm thinking that the same approach used
for the memory context stats patch might work well for asking the walsender
for a detailed dump of reorder buffer contents. Something like a
per-buffered-txn table of txn topxid, start-lsn, most recent change lsn,
number of changes, number of subxids, number of invalidations, number of
catalog changes, buffer size in memory, buffer size spilled to disk.

Anyone drastically opposed to the idea?

(I know I have to finish up with the LWLock tracepoint patchset first, this
is a RFC at this stage).

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 2ndQuadrant - PostgreSQL Solutions for the Enterprise