As has been discussed repeatedly, not knowing in which release a
relation was created / last fully scanned prevents us from reclaiming
infomask bits (and similar), and makes debugging harder because it's
unclear how far back one has to look for bugs.

I propose that for each pg_class entry we start to keep the following
additional metadata:
- CATALOG_VERSION_NO at relation creation
- PG_VERSION_NUM at relation creation
- CATALOG_VERSION_NO at last full scan by vacuum
- PG_VERSION_NUM at last full scan by vacuum

pg_upgrade would preserve those fields across upgrades.

The 'last full scan' information is useful because it'd allow
pg_upgrade to refuse to run until all tables have been fully vacuumed at
least once since some infomask bit was removed.  That'd e.g. allow
us to reclaim HEAP_MOVED_OFF/HEAP_MOVED_IN in $version_introduced + 1 -
having to vacuum each table at least once when moving from a version
older than $version_introduced wouldn't be too bad.
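
To make the pg_upgrade check concrete, here is a rough sketch of the
test it could run per relation.  Everything in it is hypothetical: the
struct, the field names and the function names are made up for
illustration, and it assumes that a full vacuum scan in the release that
retired the bit (or any later one) leaves no tuple carrying it.  Treating
a relation created in or after that release as safe is my own
assumption, too.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-relation metadata; names are illustrative only. */
typedef struct RelVersionInfo
{
    const char *relname;
    uint32_t    created_version_num;    /* PG_VERSION_NUM at creation */
    uint32_t    last_scan_version_num;  /* PG_VERSION_NUM at last full
                                         * vacuum scan, 0 if never */
} RelVersionInfo;

/*
 * A relation is considered safe if it cannot contain tuples that still
 * carry the retired bit: it was either created, or fully scanned by
 * vacuum, in a release >= the one that retired the bit.
 */
static bool
rel_is_safe_for_bit_reuse(const RelVersionInfo *rel, uint32_t bit_retired_in)
{
    if (rel->created_version_num >= bit_retired_in)
        return true;
    return rel->last_scan_version_num >= bit_retired_in;
}

/* Count the relations that would make pg_upgrade refuse to run. */
static size_t
count_rels_blocking_upgrade(const RelVersionInfo *rels, size_t nrels,
                            uint32_t bit_retired_in)
{
    size_t      blocking = 0;

    for (size_t i = 0; i < nrels; i++)
    {
        if (!rel_is_safe_for_bit_reuse(&rels[i], bit_retired_in))
            blocking++;
    }
    return blocking;
}
```

pg_upgrade would refuse to proceed while the count is non-zero, and
could list the offending relations so the user knows what to vacuum.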

I'm suggesting we also keep PG_VERSION_NUM, because that'd allow us to
backport preparation for features into the stable branches that we
otherwise couldn't introduce for on-disk compatibility reasons.  The
lack of this IIRC came up in a number of discussions. Besides that, it'd
have been useful for debugging more than once.

The 'relation creation' information would e.g. have been quite useful
when we were working through the multixact bugs - not knowing in which
release corrupted tuples might have originated has made that harder.

I'd suggest adding such fields to the CATALOG_VARLEN part of pg_class;
there's no point in having the information in relcache and such.
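
As a rough sketch of what that placement means: columns declared inside
the CATALOG_VARLEN section are part of the catalog, but since the symbol
is never defined when compiling C, they are not part of the C struct
that the relcache copies around.  The struct below is a simplified
stand-in, not the real FormData_pg_class, and the four new column names
are made up for illustration.

```c
#include <stdint.h>

typedef uint32_t Oid;           /* stand-in for PostgreSQL's Oid */

/* Simplified stand-in for FormData_pg_class; NOT the real definition. */
typedef struct FormData_pg_class_sketch
{
    Oid         relfilenode;
    uint32_t    relfrozenxid;

#ifdef CATALOG_VARLEN           /* never defined for the C compiler */
    /* Hypothetical new columns; names are illustrative only. */
    int32_t     relcreatecatversion;    /* CATALOG_VERSION_NO at creation */
    int32_t     relcreatepgversion;     /* PG_VERSION_NUM at creation */
    int32_t     relscancatversion;      /* CATALOG_VERSION_NO at last full scan */
    int32_t     relscanpgversion;       /* PG_VERSION_NUM at last full scan */
#endif
} FormData_pg_class_sketch;
```

Code that wants the values would have to fetch them from the catalog
tuple explicitly, which seems fine for something only vacuum, pg_upgrade
and debugging sessions look at.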


Andres Freund

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)