Hi, On 2020-04-16 12:41:55 -0400, Robert Haas wrote: > I'm starting a new thread for this, because the recent discussion of > problems with old_snapshot_threshold[1] touched on a lot of separate > issues, and I think it will be too confusing if we discuss all of them > on one thread. Attached are three patches.
Cool. > 0003 attempts to fix bugs in MaintainOldSnapshotTimeMapping() so that > it produces a sensible mapping. I encountered and tried to fix two > issues here: > > First, as previously discussed, the branch that advances the mapping > should not categorically do "oldSnapshotControl->head_timestamp = ts;" > assuming that the head_timestamp is supposed to be the timestamp for > the oldest bucket rather than the newest one. Rather, there are three > cases: (1) resetting the mapping resets head_timestamp, (2) extending > the mapping by an entry without dropping an entry leaves > head_timestamp alone, and (3) overwriting the previous head with a new > entry advances head_timestamp by 1 minute. > Second, the calculation of the number of entries by which the mapping > should advance is incorrect. It thinks that it should advance by the > number of minutes between the current head_timestamp and the incoming > timestamp. That would be correct if head_timestamp were the most > recent entry in the mapping, but it's actually the oldest. As a > result, without this fix, every time we move into a new minute, we > advance the mapping much further than we actually should. Instead of > advancing by 1, we advance by the number of entries that already exist > in the mapping - which means we now have entries that correspond to > times which are in the future, and don't advance the mapping again > until those future timestamps are in the past. > > With these fixes, I seem to get reasonably sensible mappings, at least > in light testing. I tried running this in one window with \watch 10: > > select *, age(newest_xmin), clock_timestamp() from > pg_old_snapshot_time_mapping(); > > And in another window I ran: > > pgbench -T 300 -R 10 > > And the age does in fact advance by ~600 transactions per minute. I still think we need a way to test this without waiting for hours to hit various edge cases. You argued against a fixed binning of old_snapshot_threshold/100 arguing its too coarse. How about a 1000 or so? For 60 days, the current max for old_snapshot_threshold, that'd be a granularity of 01:26:24, which seems fine. The best way I can think of that'd keep current GUC values sensible is to change old_snapshot_threshold to be float. Ugly, but ...? Greetings, Andres Freund