At Fri, 19 Jun 2020 08:59:48 +0530, Amit Kapila <amit.kapil...@gmail.com> wrote in > > > > Mmm. pg_walfile_name seems too specialize to > > > > pg_stop_backup(). (pg_walfile_name_offset() returns wrong result for > > > > segment boundaries.) I'm not willing to do that only to follow such > > > > suspicious(?) specification, but surely it would practically be better > > > > doing that. Please find the attached first patch. > > > > > > > > > > It is a little unclear to me how this or any proposed patch will solve > > > the original problem reported by Fujii-San? Basically, the problem > > > arises because we don't have an interlock between when the checkpoint > > > removes the WAL segment and the view tries to acquire the same. Am, I > > > missing something? > > > > I'm not sure, but I don't get the point of blocking WAL segment > > removal until the view is completed. > > > > I am not suggesting to do that. > > > The said columns of the view are > > just for monitoring, which needs an information snapshot seemingly > > taken at a certain time. And InvalidateObsoleteReplicationSlots kills > > walsenders using lastRemovedSegNo of a different time. The two are > > independent each other. > > > > Also the patch changes min_safe_lsn to show an LSN at segment boundary > > + 1. > > > > But aren't we doing last_removed_seg+1 even without the patch? See code below > > - { > - XLogRecPtr min_safe_lsn; > - > - XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 0, > - wal_segment_size, min_safe_lsn);
It is at the beginning byte of the *next* segment. Fujii-san told that it should be the next byte of it, namely "XLogSegNoOffsetToRecPtr(last_removed_seg + 1, *1*,", and the patch calculates as that. It adds the follows instead. + if (max_slot_wal_keep_size_mb >= 0 && last_removed_seg != 0) + XLogSegNoOffsetToRecPtr(last_removed_seg + 1, 1, + wal_segment_size, min_safe_lsn); regards. -- Kyotaro Horiguchi NTT Open Source Software Center