On Tue, Jan 19, 2021 at 01:48:31PM +0900, Kyotaro Horiguchi wrote: > At Mon, 18 Jan 2021 17:30:22 +0900 (JST), Kyotaro Horiguchi > <horikyota....@gmail.com> wrote in > > At Sun, 17 Jan 2021 23:02:18 -0800, Noah Misch <n...@leadboat.com> wrote in > > > On Sun, Jan 17, 2021 at 10:36:31PM -0800, Noah Misch wrote: > > > > I wrote the above based on the "PageGetLSN(page) > (snapshot)->lsn" > > > > check in > > > > TestForOldSnapshot(). If the LSN isn't important, what else explains > > > > RelationAllowsEarlyPruning() checking RelationNeedsWAL()? > > > > > > Thinking about it more, some RelationAllowsEarlyPruning() callers need > > > different treatment. Above, I was writing about the case of deciding > > > whether > > > to do early pruning. The other RelationAllowsEarlyPruning() call sites > > > are > > > deciding whether the relation might be lacking old data. For that case, > > > we > > > should check RELPERSISTENCE_PERMANENT, not RelationNeedsWAL(). > > > Otherwise, we > > > could fail to report an old-snapshot error in a case like this: > > > > A> > setup: CREATE TABLE t AS SELECT ...; > B> > xact1: BEGIN ISOLATION LEVEL REPEATABLE READ; SELECT 1; -- take snapshot > C> > xact2: DELETE FROM t; > D> > (plenty of time passes) > E> > xact3: SELECT count(*) FROM t; -- early pruning > F> > xact1: SAVEPOINT q; SELECT count(*) FROM t; ROLLBACK TO q; -- "snapshot > too old" > G> > xact1: ALTER TABLE t SET TABLESPACE something; -- start skipping WAL > H> > xact1: SELECT count(*) FROM t; -- no error, wanted "snapshot too old" > > > > > > Is that plausible? > > > > Thank you for the consideration and yes. But I get "snapshot too old" > > from the last query with the patched version. maybe I'm missing > > something. I'm going to investigate the case. > > Ah. I took that in reverse way. (caught by the discussion on page > LSN.) Ok, the "current" code works that way. So we need to perform the > check the new way in RelationAllowsEarlyPruning. So in a short answer > for the last question in regard to the reference side is yes.
Right. I expect the above procedure shows a bug in git master, but your patch versions suffice to fix that bug. > I understand that you are suggesting that at least > TransactionIdLimitedForOldSnapshots should follow not only relation > persistence but RelationNeedsWAL, based on the discussion on the > check on LSN of TestForOldSnapshot(). > > I still don't think that LSN in the WAL-skipping-created relfilenode > harm. ALTER TABLE SET TABLESPACE always makes a dead copy of every > block (except checksum) including page LSN regardless of wal_level. In > the scenario above, the last select at H correctly reads page LSN set > by E then copied by G, which is larger than the snapshot LSN at B. So > doesn't go the fast-return path before actual check performed by > RelationAllowsEarlyPruning. I agree the above procedure doesn't have a problem under your patch versions. See https://postgr.es/m/20210116043816.ga1644...@rfd.leadboat.com for a different test case. In more detail: setup: CREATE TABLE t AS SELECT ...; xact1: BEGIN; DELETE FROM t; xact2: BEGIN ISOLATION LEVEL REPEATABLE READ; SELECT 1; -- take snapshot xact1: COMMIT; (plenty of time passes, rows of t are now eligible for early pruning) xact3: BEGIN; xact3: ALTER TABLE t SET TABLESPACE something; -- start skipping WAL xact3: SELECT count(*) FROM t; -- early pruning w/o WAL, w/o LSN changes xact3: COMMIT; xact2: SELECT count(*) FROM t; -- no error, wanted "snapshot too old" I expect that shows no bug for git master, but I expect it does show a bug with your patch versions. Could you try implementing both test procedures in src/test/modules/snapshot_too_old? There's no need to make the test use wal_level=minimal explicitly; it's sufficient to catch these bugs when wal_level=minimal is in the TEMP_CONFIG file.