On 11/12/2014 05:08 PM, Robert Haas wrote:
> On Wed, Nov 12, 2014 at 7:39 AM, Heikki Linnakangas
> <[email protected]> wrote:
>> 2. When ReadBufferExtended doesn't find the page in cache, it returns the
>> buffer in !BM_VALID state (i.e. still in I/O in-progress state). Require the
>> caller to call a second function, after locking the page, to finish the I/O.
>
> This seems like a reasonable approach.
>
> If you tilt your head the right way, zeroing a page and restoring a
> backup block are the same thing: either way, you want to "read" the
> block into shared buffers without actually reading it, so that you can
> overwrite the prior contents with something else.  So, you could fix
> this by adding a new mode, RBM_OVERWRITE, and passing the new page
> contents as an additional argument to ReadBufferExtended, which would
> then memcpy() that data into place where RBM_ZERO calls MemSet() to
> zero it.
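
To make the proposal concrete, here is a toy model of the two modes. It is only a sketch: ToyBuffer, ToyReadMode and toy_read_buffer are made-up names, not bufmgr code, and the real ReadBufferExtended takes different arguments. It only shows that the overwrite mode would differ from the zeroing mode by a memcpy() in place of the zeroing.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_BLCKSZ 8192			/* same value as the default BLCKSZ */

/* Hypothetical stand-ins for RBM_ZERO and the proposed RBM_OVERWRITE. */
typedef enum
{
	TOY_RBM_ZERO,
	TOY_RBM_OVERWRITE
} ToyReadMode;

/* Hypothetical buffer: page image plus a flag standing in for BM_VALID. */
typedef struct
{
	char		data[TOY_BLCKSZ];
	int			valid;
} ToyBuffer;

/*
 * "Read" a block without touching disk: either zero the page or copy in
 * the caller-supplied image, then mark the buffer valid.  Note that the
 * buffer becomes valid here, before any caller has locked it.
 */
static void
toy_read_buffer(ToyBuffer *buf, ToyReadMode mode, const char *new_page)
{
	if (mode == TOY_RBM_OVERWRITE)
	{
		assert(new_page != NULL);
		memcpy(buf->data, new_page, TOY_BLCKSZ);
	}
	else
		memset(buf->data, 0, TOY_BLCKSZ);

	buf->valid = 1;
}
```

In this model the only difference between the modes is the source of the bytes; the locking question is separate.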

Yes, that would be quite a clean API. However, there's a problem with locking when the redo routine modifies multiple pages. Currently, you lock the page first, and replace its contents while holding the lock. With RBM_OVERWRITE, the new page contents would sneak into the buffer before RestoreBackupBlock has acquired the lock on the page, so another backend might pin and lock the page before RestoreBackupBlock does. The page contents would be valid, but they might not yet be consistent with other buffers. The redo routine might be performing an atomic operation that spans multiple pages, holding the locks on all of them until it has finished all the changes, and that other backend would see a partial result.
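
The ordering problem can be sketched with a deterministic, single-threaded toy model. Everything here is hypothetical (ToyPage, the toy_* functions, and the use of a version number to stand for page contents); it is not redo code. The observer plays another backend that can only look at the two pages when neither is locked, and reports whether it saw them out of step.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical page: a version number stands in for the page contents. */
typedef struct
{
	int			version;
	bool		locked;
} ToyPage;

/*
 * Another backend probing both pages.  If either page is locked it would
 * block, so it sees nothing.  Otherwise it reports whether the two pages
 * disagree, i.e. whether it caught a half-applied multi-page change.
 */
static bool
toy_observer_sees_torn(const ToyPage *a, const ToyPage *b)
{
	if (a->locked || b->locked)
		return false;
	return a->version != b->version;
}

/* RBM_OVERWRITE-style order: contents land first, lock is taken after. */
static bool
toy_redo_overwrite_then_lock(ToyPage *a, ToyPage *b, int newver)
{
	bool		torn;

	a->version = newver;		/* new contents installed, not yet locked */
	torn = toy_observer_sees_torn(a, b);	/* the unprotected window */
	a->locked = true;
	b->version = newver;
	b->locked = true;
	a->locked = false;
	b->locked = false;
	return torn;
}

/* Current order: lock first, replace contents while holding the lock. */
static bool
toy_redo_lock_then_replace(ToyPage *a, ToyPage *b, int newver)
{
	bool		torn = false;

	a->locked = true;
	a->version = newver;
	torn = torn || toy_observer_sees_torn(a, b);
	b->locked = true;
	b->version = newver;
	torn = torn || toy_observer_sees_torn(a, b);
	a->locked = false;
	b->locked = false;
	return torn;
}
```

Starting from two unlocked pages at the same version, the first function reports a torn read and the second does not, because in the second the observer never finds both pages unlocked while they disagree.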

- Heikki


