On Fri, Mar 6, 2026 at 4:13 PM Shinya Kato <[email protected]> wrote:
>
> On Mon, Mar 2, 2026 at 11:44 PM Fujii Masao <[email protected]> wrote:
> > With the patch applied, I set up a logical replication and inserted a
> > row every second. Even with continuous inserts, NULL was shown in the
> > lag columns of pg_stat_replication. That makes me wonder whether the
> > patch's approach is sufficient to address the issue.
>
> Thank you for the review and testing! I had only considered the issue
> in the context of physical replication, but as you pointed out, my
> approach is insufficient for logical replication.
>
> > Relying solely on replies from the standby or subscriber seems a bit
> > fragile to me. If the goal is to keep showing the last measured lag
> > for some time, perhaps we should introduce a rate limit on when NULL
> > is displayed in the lag columns?
>
> My primary goal was to ensure that the source code comments match the
> actual behavior, as the comment stating "the second such message must
> result from wal_receiver_status_interval expiring on the standby" is
> inaccurate. However, as you noted, the patch alone is not sufficient
> to fully address the issue.
>
> > For example, if there has been no activity (i.e., sentPtr == applyPtr
> > and applyPtr has not changed since the previous cycle) for, say, 10
> > seconds, then we could allow NULL to be shown. Thought?
>
> I considered a time-based rate limit, but it is difficult to choose an
> appropriate threshold. Furthermore, the walsender has no way of
> knowing the standby's or subscriber's wal_receiver_status_interval
> setting.
>
> The attached v2 patch takes a different approach: it additionally
> requires that all reported positions (write/flush/apply) remain
> unchanged from the previous reply. This directly detects a truly idle
> system without relying on timeouts: if any position has advanced, new
> WAL activity must have occurred, so we should not clear the lag values
> even if the lag tracker is empty.

This approach looks good to me.

One comment: currently, the lag becomes NULL basically after about one
wal_receiver_status_interval during periods of no activity. OTOH, with
this approach, it seems it would take about twice
wal_receiver_status_interval. Is this understanding correct?

Regards,

-- 
Fujii Masao
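
For reference, the condition described for v2 could be sketched roughly as
below. This is only an illustration of the idea as stated in the thread, not
code from the patch: the names WalPos, PrevReply and should_clear_lag are
hypothetical, and combining "lag tracker is empty", "apply == sent" and
"positions unchanged" into one test is my assumption about how the checks
fit together.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's XLogRecPtr so the sketch compiles on its own. */
typedef uint64_t WalPos;

/* Hypothetical per-walsender memory of the previously received reply. */
typedef struct PrevReply
{
    bool    valid;          /* false until the first reply has been seen */
    WalPos  write_pos;
    WalPos  flush_pos;
    WalPos  apply_pos;
} PrevReply;

/*
 * Decide whether the lag columns may be cleared for this reply, then
 * remember the reply for the next call.
 *
 * Assumed rule: clear only if the lag tracker is empty, everything sent
 * has been applied, and write/flush/apply are all identical to the
 * previous reply.
 */
static bool
should_clear_lag(PrevReply *prev, WalPos sent_pos,
                 WalPos write_pos, WalPos flush_pos, WalPos apply_pos,
                 bool tracker_empty)
{
    bool    unchanged = prev->valid &&
                        prev->write_pos == write_pos &&
                        prev->flush_pos == flush_pos &&
                        prev->apply_pos == apply_pos;
    bool    clear = tracker_empty && apply_pos == sent_pos && unchanged;

    prev->valid = true;
    prev->write_pos = write_pos;
    prev->flush_pos = flush_pos;
    prev->apply_pos = apply_pos;

    return clear;
}
```

In this sketch, a reply whose positions differ from the previous one never
clears the lag, so clearing needs at least one further reply carrying
identical positions. How that maps onto wall-clock time depends on when the
receiver actually sends its replies, which the sketch does not model.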
