On 6 July 2017 at 15:29, Jim Finnerty <jfinn...@amazon.com> wrote:
>
> Feel free to knock down this 'straw man' and propose something better!

I think the pattern in this design that we don't want is that it imposes extra complexity on every user of every page, even when the page doesn't have the problem and even when the problem isn't anywhere in the database. Even years from now, when this problem is long gone, you'll have code paths for dealing with this special page format that are rarely executed and never tested, and that will have to be maintained blind.

Ideally, a solution to this problem would impose a cost only on the weird pages, and only temporarily, and would leave the database in a "consistent" state that doesn't require any special processing when reading the data.

The "natural" solution is what was discussed for incompatible page format changes in the past, where there's a point release of one Postgres version that tries to ensure there's enough space on the page for the next version and keeps track of whether there are any problematic pages. Then you would be blocked from upgrading until you had ensured all pages had space (presumably by running some special "vacuum upgrade" or something like that).

Incidentally, it's somewhat intriguing to think about what would happen if we *always* did such a tombstone for deletes. Or perhaps only when it's a full_page_write. Since the whole page is going into the log and that tuple will never be modified again, you could imagine just replacing the tuple with the LSN of the deletion and letting anyone who really needs it fetch it from the xlog. That would be a completely different model from the way Postgres works, though. More like a log-structured storage system.

--
greg
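
For illustration only, here is a toy Python model of the "point release reserves space, then upgrade is gated" idea described above. Nothing in it is Postgres code: the Page class, EXTRA_BYTES_NEEDED, vacuum_upgrade() and can_upgrade() are all hypothetical names, the page is reduced to byte counts, and page header overhead is ignored. The only point is that the cost falls on the problematic pages, once, and readers never need a special case afterwards.

```python
from dataclasses import dataclass
from typing import List

PAGE_SIZE = 8192         # PostgreSQL's default block size, in bytes
EXTRA_BYTES_NEEDED = 16  # hypothetical: extra space the next page format needs


@dataclass
class Page:
    live_bytes: int      # space held by live tuples
    dead_bytes: int = 0  # space held by dead tuples (reclaimable)

    @property
    def free_bytes(self) -> int:
        return PAGE_SIZE - self.live_bytes - self.dead_bytes

    def has_room(self) -> bool:
        return self.free_bytes >= EXTRA_BYTES_NEEDED


def problematic_pages(pages: List[Page]) -> List[int]:
    """Indexes of pages that cannot yet hold the next version's format."""
    return [i for i, page in enumerate(pages) if not page.has_room()]


def vacuum_upgrade(pages: List[Page]) -> List[int]:
    """Hypothetical "vacuum upgrade" pass.

    Only touches problematic pages: reclaims their dead space and
    returns the indexes that are still too full afterwards. A real
    implementation would also have to move live tuples elsewhere.
    """
    for i in problematic_pages(pages):
        pages[i].dead_bytes = 0
    return problematic_pages(pages)


def can_upgrade(pages: List[Page]) -> bool:
    """The upgrade is blocked while any page lacks the reserved space."""
    return not problematic_pages(pages)
```

In this model a page that is full of live tuples stays problematic after vacuum_upgrade(), so can_upgrade() keeps returning False until something frees space on it; that is the "blocked from upgrading" behaviour, with no special handling left over once every page passes.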