On Tue, Jul 17, 2012 at 9:16 PM, Bruce Momjian <br...@momjian.us> wrote:
> WAL is not guaranteed to be the same between PG major versions, so doing
> anything with WAL is pretty much a no-go.

I understand that the WAL format changes, sometimes dramatically,
between versions. What I'm suggesting is that the first WAL record
emitted by the binary upgrade process could be entitled "WAL-stream
upgrade to 9.4". Old versions would either fail to understand it or
understand it to mean "stop replay, you won't even understand what's
about to be said."

At that point, start up the new version on the same cluster and have
it continue replay from that position forward, which should all be in
the new format that it can understand.  It need not understand the old
format in that case. The tricky part is this single record: it has to
tell the replayer of the old version to stop, while the replayer of
the new version somehow has to know it is the right place to start.
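
To make the hand-off concrete, here is a rough sketch of the idea in
Python. Everything in it is made up for illustration: the Record class,
the "upgrade-marker" kind, and both replay functions are hypothetical
and do not correspond to any actual PostgreSQL record type or API. It
only shows the old replayer stopping at the marker and the new one
picking up right after it.

```python
from dataclasses import dataclass

# Hypothetical record kind standing in for "WAL-stream upgrade to 9.4".
UPGRADE_MARKER = "upgrade-marker"


@dataclass(frozen=True)
class Record:
    position: int   # stand-in for a WAL record position
    kind: str       # stand-in for a record type
    payload: str = ""


def replay_old(records, apply):
    """Old-version replay: apply records until the marker is seen.

    Returns the marker's position, or None if no marker was found.
    """
    for rec in records:
        if rec.kind == UPGRADE_MARKER:
            return rec.position
        apply(rec)
    return None


def replay_new(records, marker_position, apply):
    """New-version replay: skip everything up to and including the
    marker, then apply the rest, which is all in the new format."""
    for rec in records:
        if rec.position > marker_position:
            apply(rec)
```

The old replayer never has to parse anything past the marker, and the
new one never has to parse anything before it, which is the whole
point.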

One mechanism could be a WAL file segment boundary: the standby could
be told to exit when it finishes recovery of the segment
0000000100001234000055CD, and to start the new version beginning
recovery at 0000000100001234000055CE (one higher), and that would be
the first WAL emitted by pg_upgrade. In principle the same is possible
using the fine-grained record position, such as XXXXX/NN, but that may
be more complex for not much gain.
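
As a sketch of the segment-boundary bookkeeping (again in Python, with
hypothetical helper names): the 24-hex-digit file name is split into an
8-digit timeline and the remaining 16 digits, which I treat here as one
plain counter. That is a simplifying assumption; real segment file
names wrap the last field much earlier, so this is not a faithful
reimplementation of PostgreSQL's naming, just enough to show which
replayer would own which segment.

```python
import string


def split_segment_name(name):
    """Split a 24-hex-digit segment name into (timeline, counter)."""
    if len(name) != 24 or any(c not in string.hexdigits for c in name):
        raise ValueError(f"not a 24-hex-digit segment name: {name!r}")
    return name[:8], int(name[8:], 16)


def next_segment_name(name):
    """Name of the segment one higher, on the same timeline."""
    timeline, counter = split_segment_name(name)
    if counter + 1 >= 1 << 64:
        raise OverflowError("segment counter would overflow 16 hex digits")
    return f"{timeline}{counter + 1:016X}"


def replayer_for(segment, last_old_segment):
    """Return "old" if the old version should replay this segment,
    "new" if it belongs to the stream emitted after the upgrade."""
    timeline, counter = split_segment_name(segment)
    last_timeline, last_counter = split_segment_name(last_old_segment)
    if timeline != last_timeline:
        raise ValueError("segments are on different timelines")
    return "old" if counter <= last_counter else "new"
```

The standby's wrapper script would call
next_segment_name("0000000100001234000055CD") to learn where the new
version should begin recovery, and replayer_for() to decide which
binary gets any given archived segment.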

This also means the database would be stuck in an inconsistent state
when it starts, not unlike when recovering from an on-line base backup.
And that's totally reasonable: the new version has to start up
presuming that the database cluster does not yet make enough sense to
enter hot standby.

Yet another mechanism is to not have the Postgres recovery process
apply the WAL, but rather some special-purpose program that knows how
to count through and apply specially-formatted WAL segments, and then
set the resultant cluster to start recovering from the WAL past this
span of specially-formatted WAL.  The crux is to get some continuity
in this stream, and there are many ways to slice it. Otherwise, the
continuous archives will have a gap while a new base backup is taken
of data that mostly remains unchanged.


Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)