Re: [HACKERS] Measuring replay lag

Simon Riggs Tue, 21 Feb 2017 04:53:21 -0800

On 17 February 2017 at 07:45, Thomas Munro
<thomas.mu...@enterprisedb.com> wrote:
> On Fri, Feb 17, 2017 at 12:45 AM, Simon Riggs <si...@2ndquadrant.com> wrote:
>> Feeling happier about this for now at least.
>
> Thanks!


And happier again, leading me to move to the next stage of review,
focusing on the behaviour emerging from the design.

So my current understanding is that this doesn't rely upon LSN
arithmetic to measure lag, which is good. That means logical
replication should "just work" and future mechanisms to filter
physical WAL will also just work. This is important, so please comment
if you see that isn't the case.

I notice that LagTrackerRead() doesn't do anything to interpolate the
time given, so at present any attempt to prune the lag sample buffer
would result in falsely increasing the lag times reported. Which is
probably the reason why you say "There may be better adaptive
compaction algorithms."  We need to look at this some more, an initial
guess might be that we need to insert fewer samples as the buffer
fills since the LagTrackerRead() algorithm is O(N) on number of
samples and thus increasing the buffer itself isn't a great plan.

It would be very nice to be able to say something like that the +/-
confidence limits of the lag are no more than 50% of the lag time, so
we have some idea of how accurate the value is at any point. We need
to document the accuracy of the result, otherwise we'll be answering
questions on that for some time. So lets think about that now.

Given LagTrackerRead() is reading the 3 positions in order, it seems
sensible to start reading the LAG_TRACKER_FLUSH_HEAD from the place
you finished reading LAG_TRACKER_WRITE_HEAD etc.. Otherwise we end up
doing way too much work with larger buffers.

Which makes me think about the read more. The present design
calculates everything on receipt of standby messages. I think we
should simply record the last few messages and do the lag calculation
when the values are later read, if indeed they are ever read. That
would allow us a much better diagnostic view, at least. And it would
allow you to show a) latest value, b) smoothed in various ways, or c)
detail of last few messages for diagnostics. The latest value would be
the default value in pg_stat_replication - I agree we shouldn't make
that overly wide, so we'd need another function to access the details.

What is critical is that we report stable values as lag increases.
i.e. we need to iron out any usage cases so we don't have to fix them
in PG11 and spend a year telling people "yeh, it does that" (like
we've been doing for some time). So the diagnostics will help us
investigate this patch over various use cases...

I think what we need to show some test results with the graph of lag
over time for these cases:
1. steady state - pgbench on master, so we can see how that responds
2. blocked apply on standby - so we can see how the lag increases but
also how the accuracy goes down as the lag increases and whether the
reported value changes (depending upon algo)
3. burst mode - where we go from not moving to moving at high speed
and then stop again quickly
+other use cases you or others add

Does the proposed algo work for these cases? What goes wrong with it?
It's the examination of these downsides, if any, are the things we
need to investigate now to allow this to get committed.


Some minor points on code...
Why are things defined in walsender.c and not in .h?
Why is LAG_TRACKER_NUM_READ_HEADS not the same as NUM_SYNC_REP_WAIT_MODE?
...and other related constants shouldn't be redefined either.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Measuring replay lag

Reply via email to