Josh Berkus <j...@agliodbs.com> writes:
> As I understand it, we don't currently have any mechanism in Postgres
> which would cause allocated-but-empty pages.

That's not correct: the situation can easily arise after a database crash. (The scenario is that we've done smgrextend to add the first page to the file, but not yet completed or WAL-logged insertion of any data into it. This leaves us with an empty, all-zero page that will be ignored until we next want to add some data to the table.)

The core problem here is that file extension is not a transactional operation, because it doesn't roll back on crash. The current matview design gets around this problem by requiring that transition between scannable and unscannable states involve a complete table rewrite, and thus the transactionality issue can be hidden behind a transactional update of the matview's pg_class.relfilenode field. IMO, that is obviously a dead-end design, because we are going to want scannability status updates associated with partial updates of the matview's contents. So Kevin's summary is leaving out one key desirable property:

(4) ability to change scannability state without a full table rewrite.

Putting the state into pg_class would preserve that property.

			regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
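
The crash scenario described above can be illustrated with a toy model. This is only a sketch in Python, not PostgreSQL code: `extend_file`, `page_is_empty`, and `simulate_crash_after_extend` are hypothetical names invented for the illustration, and the 8192-byte page size is assumed to match the default block size. It shows that once a file has been extended by a zero-filled page, a "crash" before any data is written leaves that page in place, since nothing rolls the extension back.

```python
import os
import tempfile

PAGE_SIZE = 8192  # assumed to match PostgreSQL's default block size


def extend_file(path):
    """Append one all-zero page and return its page number.

    Toy stand-in for a storage-level file extension: the change hits
    the file immediately and is not tied to any transaction.
    """
    with open(path, "ab") as f:
        f.write(bytes(PAGE_SIZE))
        f.flush()
        os.fsync(f.fileno())
    return os.path.getsize(path) // PAGE_SIZE - 1


def page_is_empty(path, page_no):
    """Return True if the given page consists entirely of zero bytes."""
    with open(path, "rb") as f:
        f.seek(page_no * PAGE_SIZE)
        return f.read(PAGE_SIZE) == bytes(PAGE_SIZE)


def simulate_crash_after_extend():
    """Extend an empty file, then 'crash' before inserting any data.

    Returns (page number, file size after the crash, whether the page
    is still all zeros).
    """
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        page_no = extend_file(path)
        # Simulated crash: the data insertion (and its logging) that
        # would normally follow never happens. The file stays extended.
        size_after = os.path.getsize(path)
        return page_no, size_after, page_is_empty(path, page_no)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(simulate_crash_after_extend())
```

In this model the file length alone cannot distinguish "never populated" from "extended but empty after a crash", which is why state carried by the file's physical size is not transactional.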