Re: [HACKERS] Spreading full-page writes

Heikki Linnakangas Tue, 27 May 2014 02:09:08 -0700

On 05/26/2014 02:26 PM, Greg Stark wrote:

On Mon, May 26, 2014 at 1:22 PM, Heikki Linnakangas <[email protected]> wrote:

The second record is generated before the checkpoint is finished and the
checkpoint record is written.  So it will be there.

(if you crash before the checkpoint is finished, the in-progress
checkpoint is no good for recovery anyway, and won't be used)


Another idea would be to have separate checkpoints for each buffer
partition. You would have to start recovery from the oldest checkpoint of
any of the partitions.

Yeah. Simon suggested that when we talked about this, but I didn't
understand how that works at the time. I think I do now. The key to
making it work is distinguishing, when starting recovery from the latest
checkpoint, whether a record for a given page can be replayed safely. I
used flags on WAL records in my proposal to achieve this, but using
buffer partitions is simpler.

For simplicity, let's imagine that we have two Redo-pointers for each
checkpoint record: one for even-numbered pages, and another for
odd-numbered pages. When checkpoint begins, we first update the
Even-redo pointer to the current WAL insert location, and then flush all
the even-numbered buffers in the buffer cache. Then we do the same for Odd.
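
To make the ordering concrete, here is a toy sketch of that checkpoint
sequence. It is Python rather than backend C, and everything in it
(ToyServer, flush_partition, the integer WAL locations, the fixed
10-unit WAL advance per flushed page) is a made-up stand-in for
illustration, not an existing PostgreSQL structure or function.

```python
from dataclasses import dataclass, field


@dataclass
class Buffer:
    page_no: int
    dirty: bool = True


@dataclass
class ToyServer:
    """Hypothetical stand-in for the buffer cache plus WAL insert position."""
    buffers: list
    wal_insert_location: int = 0
    flushed: list = field(default_factory=list)

    def flush_partition(self, parity):
        # Write out every dirty buffer whose page number has the given parity.
        for buf in self.buffers:
            if buf.page_no % 2 == parity and buf.dirty:
                buf.dirty = False
                self.flushed.append(buf.page_no)
                # Assumption: other backends keep inserting WAL while we flush.
                self.wal_insert_location += 10


def checkpoint(server):
    # Even partition: remember the redo pointer first, then flush.
    even_redo = server.wal_insert_location
    server.flush_partition(0)
    # Odd partition: same steps, so its redo pointer is later in the WAL.
    odd_redo = server.wal_insert_location
    server.flush_partition(1)
    # Both pointers would be stored in the checkpoint record.
    return {"even_redo": even_redo, "odd_redo": odd_redo}


server = ToyServer(buffers=[Buffer(n) for n in range(4)], wal_insert_location=100)
record = checkpoint(server)
print(record)
```

The point of the sketch is only the ordering: each partition's redo
pointer is taken before that partition is flushed, so the Odd pointer
ends up at or after the Even pointer.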

Recovery begins at the Even-redo pointer. Replay works as normal, but
until you reach the Odd-pointer, you refrain from replaying any changes
to Odd-numbered pages. After reaching the odd-pointer, you replay
everything as normal.
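
The replay rule can be sketched as a simple filter. Again this is a toy
in Python, not backend code: should_replay, replay, and the
(lsn, page_no) tuples are hypothetical, and I am assuming that a record
located exactly at a redo pointer is replayed. Skipping the early
Odd-page records relies on the Odd buffers having been flushed after the
Odd-redo pointer was set, so those changes are already on disk.

```python
def should_replay(record_lsn, page_no, odd_redo):
    """Return True if a WAL record touching page_no may be replayed."""
    if page_no % 2 == 0:
        # Even pages: replay everything from the Even-redo pointer onwards.
        return True
    # Odd pages: hold off until replay has reached the Odd-redo pointer.
    return record_lsn >= odd_redo


def replay(wal, even_redo, odd_redo):
    """Return the (lsn, page_no) records that recovery would apply."""
    applied = []
    for lsn, page_no in wal:
        if lsn < even_redo:
            # Recovery starts at the Even-redo pointer; older WAL is not read.
            continue
        if should_replay(lsn, page_no, odd_redo):
            applied.append((lsn, page_no))
    return applied


# Toy WAL stream of (lsn, page_no) pairs, with Even-redo=100 and Odd-redo=120.
wal = [(90, 2), (100, 0), (110, 1), (115, 2), (120, 3), (130, 1)]
applied = replay(wal, even_redo=100, odd_redo=120)
print(applied)
```

In this toy stream the record at 110 for page 1 is the only one inside
the replay range that gets skipped.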


Hmm, that seems actually doable...

- Heikki


