Josh Berkus <j...@agliodbs.com> writes:
> As I understand it, we don't currently have any mechanism in Postgres
> which would cause allocated-but-empty pages.

That's not correct: the situation can easily arise after a database
crash.  (The scenario is that we've done smgrextend to add the first
page to the file, but not yet completed or WAL-logged insertion of any
data into it.  This leaves us with an empty, all-zero page that will be
ignored until we next want to add some data to the table.)

The core problem here is that file extension is not a transactional
operation, because it doesn't roll back on crash.
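
To make the scenario concrete, here is a toy sketch in Python. It is not
Postgres code: extend_relation, page_is_all_zero, and the empty "wal" list
are hypothetical stand-ins, and only the 8192-byte default block size is
taken from Postgres. It shows the file growing by one zeroed page and then
a "crash" before anything is inserted or logged.

```python
import os
import tempfile

BLCKSZ = 8192  # PostgreSQL's default block size


def extend_relation(path):
    """Hypothetical stand-in for smgrextend: append one all-zero page.

    The write goes straight to the file, so nothing undoes it later.
    """
    with open(path, "ab") as f:
        f.write(b"\x00" * BLCKSZ)
        f.flush()
        os.fsync(f.fileno())


def page_is_all_zero(path, blkno):
    """Return True if block blkno exists in full and contains only zero bytes."""
    with open(path, "rb") as f:
        f.seek(blkno * BLCKSZ)
        page = f.read(BLCKSZ)
    return len(page) == BLCKSZ and not any(page)


def simulate_crash_after_extend():
    """Extend the file, then 'crash' before any insertion is WAL-logged."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        wal = []               # toy WAL: nothing is logged before the crash
        extend_relation(path)  # step 1: the file grows by one page
        # Crash here: step 2 (insert data and write a WAL record) never runs,
        # so recovery has nothing to replay and nothing to undo.
        nblocks = os.path.getsize(path) // BLCKSZ
        return nblocks, page_is_all_zero(path, 0), wal
    finally:
        os.remove(path)
```

After the simulated crash the file still has its one page, the page is all
zeroes, and the toy WAL has no record of it, which is the
allocated-but-empty state described above.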

The current matview design gets around this problem by requiring that
any transition between the scannable and unscannable states involve a
complete table rewrite, so that the transactionality issue can be hidden
behind a transactional update of the matview's pg_class.relfilenode field.
IMO, that is obviously a dead-end design, because we are going to want
scannability status updates associated with partial updates of the
matview's contents.  So Kevin's summary is leaving out one key desirable
property:

(4) ability to change scannability state without a full table rewrite.

Putting the state into pg_class would preserve that property.
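
A rough way to picture this, again as a toy Python model rather than
anything resembling the real catalog code: ToyCatalog and the is_scannable
column are hypothetical names invented for illustration, and only
relfilenode is a real pg_class field. The point of the sketch is just that
a catalog row update either commits or vanishes, and that flipping a flag
in the row does not require assigning a new relfilenode.

```python
class ToyCatalog:
    """Toy transactional catalog row: changes are visible only after commit."""

    def __init__(self, row):
        self.committed = dict(row)

    def update(self, changes, crash_before_commit=False):
        # Build the new row version without touching the committed one.
        pending = dict(self.committed, **changes)
        if crash_before_commit:
            # The pending version is simply discarded, as on rollback.
            return
        self.committed = pending


# Hypothetical row: is_scannable is an assumed column name, not a real one.
catalog = ToyCatalog({"relfilenode": 16384, "is_scannable": False})

# A crash before commit leaves the old state fully intact.
catalog.update({"is_scannable": True}, crash_before_commit=True)
state_after_crash = dict(catalog.committed)

# A committed update changes the flag but keeps the same relfilenode,
# i.e. no table rewrite is implied by the state change.
catalog.update({"is_scannable": True})
state_after_commit = dict(catalog.committed)
```

Contrast this with the file-extension case, where the on-disk change
survives a crash even though nothing was committed.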

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
