On Thu, Mar 16, 2017 at 12:07 PM, Simon Riggs <si...@2ndquadrant.com> wrote:
> There are two ways of knowing the lag: 1) by measurement/sampling,
> which is the main way this patch approaches this, 2) by direct
> observation the LSNs match. Both are equally valid ways of
> establishing knowledge. Strangely (2) is the only one of those that is
> actually precise and yet you say it is bogus. It is actually the
> measurements which are approximations of the actual state.
>
> The reality is that the lag can change dis-continuously between zero
> and non-zero. I don't think we should hide that from people.
>
> I suspect that your "entirely bogus" feeling comes from the point that
> we actually have 3 states, one of which has unknown lag.
>
> A) "Currently caught-up"
> WALSender LSN == WALReceiver LSN (info type (1))
> At this point the current lag is known precisely to be zero.
>
> B) "Work outstanding, no reply yet"
> Immediately after where WALSenderLSN > WALReceiverLSN, yet we haven't
> yet received new reply
> We expect to stay in this state for however long it takes to receive a
> reply, which could be wal_receiver_status_interval or longer if the
> lag is greater. At this point we have no measurement of what the lag
> is. We could reply NULL since we don't know. We could reply with the
> last measured lag when we were last in state C, but if the new reply
> was delayed for more than that we'd need to reply that the lag is at
> least as high as the delay since last time we left state A.
>
> C) "Continuous flow"
> WALSenderLSN > WALReceiverLSN and we have received a reply
> (measurement, info type (2))
> This is the main case. Easy-ish!
>
> So I think we need to first agree that A and B states exist and how to
> report lag in each state.
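To make sure we are talking about the same three states, here is a rough
sketch of how I read them, in Python rather than walsender C. All names
(LagState, classify, lag_lower_bound_in_b) are hypothetical and appear
nowhere in the patch, LSNs are modelled as plain integers, and the
max() rule for state B is only my reading of the "at least as high as
the delay" remark above.

```python
from datetime import timedelta
from enum import Enum


class LagState(Enum):
    CAUGHT_UP = "A"        # sender LSN == receiver LSN
    NO_REPLY_YET = "B"     # work outstanding, no new reply received
    CONTINUOUS_FLOW = "C"  # work outstanding, replies arriving


def classify(sender_lsn: int, receiver_lsn: int,
             reply_since_leaving_a: bool) -> LagState:
    """Map the sender's view onto the three states in the quoted text."""
    if sender_lsn == receiver_lsn:
        return LagState.CAUGHT_UP
    if not reply_since_leaving_a:
        return LagState.NO_REPLY_YET
    return LagState.CONTINUOUS_FLOW


def lag_lower_bound_in_b(last_measured: timedelta,
                         since_left_a: timedelta) -> timedelta:
    """One possible answer for state B (an assumption, not the patch):
    the last measured lag, unless the reply has already been outstanding
    for longer than that, in which case the elapsed time is the bound."""
    return max(last_measured, since_left_a)
```

This only illustrates the classification; it says nothing about which
value the *_lag columns ought to show in each state.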
I agree that these states exist, but we disagree on what 'lag' really
means, or, rather, which of several plausible definitions would be the
most useful here.

My proposal is that the *_lag columns should always report how long it
took for recently written, flushed and applied WAL to be written,
flushed and applied (and for the primary to know about it). By this
definition, sent LSN = applied LSN is not a special case: we simply
report how long that LSN took to be written, flushed and applied.

Your proposal is that the *_lag columns should report how far in the
past the standby is at each of the three stages with respect to the
current end of WAL. By this definition, when sent LSN = applied LSN we
are currently in the 'A' state, meaning 'caught up', and should show
00:00:00.

Here are two reasons I prefer my definition:

* you can trivially convert from my definition to yours on the basis of
  existing information:

      CASE WHEN sent_location = replay_location
           THEN '00:00:00'::interval
           ELSE replay_lag END,

  but there is no way to get from your definition to mine

* lag numbers reported using my definition tell you how long each of
  the synchronous replication levels takes, but with your definition
  they only do that if you catch them during times when they aren't
  showing the special case 00:00:00; a fast standby running any
  workload other than a benchmark is often going to show all-caught-up
  00:00:00, so the new columns will be useless for that purpose

--
Thomas Munro
http://www.enterprisedb.com
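P.S. A small Python sketch of why the conversion only goes one way. The
function name caught_up_adjusted_lag is hypothetical, and LSNs are again
modelled as plain integers; it simply mirrors the CASE expression on
sent_location, replay_location and replay_lag.

```python
from datetime import timedelta
from typing import Optional


def caught_up_adjusted_lag(sent_location: int, replay_location: int,
                           replay_lag: Optional[timedelta]) -> Optional[timedelta]:
    """Turn a 'how long did recent WAL take' lag into a 'how far behind
    is the standby right now' lag: zero whenever the standby has
    replayed everything sent, otherwise the measured value unchanged."""
    if sent_location == replay_location:
        return timedelta(0)
    return replay_lag


# Two caught-up standbys whose measured replay lags differ...
fast = caught_up_adjusted_lag(500, 500, timedelta(milliseconds=2))
slow = caught_up_adjusted_lag(500, 500, timedelta(milliseconds=900))

# ...become indistinguishable after the conversion, so the measured
# value cannot be recovered from the converted one.
print(fast == slow)
```

Both inputs collapse to a zero interval, which is the information loss I
mean: the mapping in this direction is not invertible.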