I've been hacking on a tool to allow resynchronizing an old master server after failover. The need to do a full backup/restore has been a common complaint ever since we've had streaming replication. I saw on the wiki that this was discussed in the dev meeting; too bad I couldn't make it.

In a nutshell, the idea is to copy everything that has changed between the clusters, like rsync does, but instead of reading through all the files, use the WAL to determine what has changed. Here's a somewhat more detailed explanation, from the README:

Theory of operation

The basic idea is to copy everything from the new cluster to the old one, except for the blocks that we know to be the same.

1. Scan the WAL log of the old cluster, starting from the point where the new cluster's timeline history forked off from the old cluster. For each WAL record, make a note of the data blocks that are touched. This yields a list of all the data blocks that were changed in the old cluster after the new cluster forked off.

2. Copy all those changed blocks from the new master to the old master.

3. Copy all other files, such as clog and configuration files, from the new cluster to the old one; that is, everything except the relation files.

4. Apply the WAL from the new master, starting from the checkpoint created at failover. (pg_rewind doesn't actually apply the WAL; it just creates a backup label file indicating that when PostgreSQL is started, it will start replay from that checkpoint and apply all the required WAL.)
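
To make steps 1 and 2 concrete, here is a minimal Python sketch on a toy in-memory model. It is not pg_rewind's code: the WalRecord class, the dict-based cluster layout, the integer LSNs and both function names are hypothetical, chosen only for illustration, and the handling of blocks missing from the new cluster is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Set, Tuple

# Toy model (all names here are illustrative, not pg_rewind's):
# a block is identified by (relation file, block number), and a cluster
# is a dict mapping relation file -> {block number -> block contents}.
BlockRef = Tuple[str, int]
Cluster = Dict[str, Dict[int, bytes]]


@dataclass(frozen=True)
class WalRecord:
    lsn: int                      # position of the record in the WAL
    blocks: Tuple[BlockRef, ...]  # data blocks the record touches


def changed_blocks(old_wal: Iterable[WalRecord], fork_lsn: int) -> Set[BlockRef]:
    """Step 1: note every block touched in the old cluster from the fork point on."""
    touched: Set[BlockRef] = set()
    for rec in old_wal:
        if rec.lsn >= fork_lsn:
            touched.update(rec.blocks)
    return touched


def copy_changed_blocks(new: Cluster, old: Cluster, blocks: Set[BlockRef]) -> None:
    """Step 2: overwrite each changed block in the old cluster with the new cluster's copy."""
    for rel, blkno in sorted(blocks):
        if blkno in new.get(rel, {}):
            old.setdefault(rel, {})[blkno] = new[rel][blkno]
        else:
            # Assumption of this toy model: a block missing from the new
            # cluster is simply removed from the old one.
            old.get(rel, {}).pop(blkno, None)


# Both clusters start from the same state at the fork point (LSN 100).
old = {"t": {0: b"a", 1: b"b", 2: b"c"}}
new = {"t": {0: b"a", 1: b"b", 2: b"c"}}

# After the fork, the old master changed block 1; the new master changed block 2.
old["t"][1] = b"B-old"
new["t"][2] = b"C-new"
old_wal = [WalRecord(90, (("t", 0),)), WalRecord(110, (("t", 1),))]

to_copy = changed_blocks(old_wal, fork_lsn=100)
copy_changed_blocks(new, old, to_copy)
```

In this toy run only block 1 of "t" is copied, which undoes the old master's divergent change. Block 2 still holds its pre-fork contents in the old cluster; that difference is what replaying the new master's WAL in step 4 is meant to bring up to date.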

Please take a look: https://github.com/vmware/pg_rewind

- Heikki

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)