Currently, the startup process tries the "restore_command" before
the WAL files that are locally available under pg_wal/ [1].  I believe
we should change this behavior.

== The Problem ==

This issue came to our attention after we migrated an application from
an object storage backend, and noticed that restarting a standby node
takes hours or sometimes days.

We are using a shared WAL archive and find it practical to have
"restore_command" configured in case we leave a standby offline
for a long time.  However, during the short window in which the standby
is restarted, the master manages to archive a segment.  The standby
then executes "restore_command" successfully and keeps downloading
WAL from the archive, causing the recovery to take orders of magnitude
longer.

== The Workarounds ==

We can possibly work around this inside the "restore_command" or
by delaying the archiving.  Working around it inside the
"restore_command" would involve checking whether the file already
exists under pg_wal/.  This is not easy to do correctly, because the
WAL file may be only partially written.  It would be easier for
Postgres to do this, as it knows where to stop processing the local
WAL.
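
For illustration, such a wrapper could look roughly like the sketch
below.  This is only a sketch under assumptions: the pg_wal path, the
segment size, the fetch command and the archive URL are all made up,
and the completeness check is exactly the weak point mentioned above.

    #!/usr/bin/env python3
    # Hypothetical restore_command wrapper: skip the archive fetch when
    # the requested segment already exists under pg_wal/, so that the
    # startup process falls back to the local copy.
    #
    #   restore_command = '/usr/local/bin/restore_wal.py %f %p'
    import os
    import subprocess
    import sys

    PG_WAL = '/var/lib/postgresql/data/pg_wal'    # assumption
    SEGMENT_SIZE = 16 * 1024 * 1024               # default --wal-segsize
    FETCH = ['aws', 's3', 'cp']                   # assumption: any fetch tool
    ARCHIVE = 's3://example-bucket/wal/'          # assumption

    def main():
        filename, dest = sys.argv[1], sys.argv[2]     # %f and %p
        local = os.path.join(PG_WAL, filename)

        # Exiting non-zero makes the startup process give up on the
        # archive and read the segment from pg_wal/ instead.  The size
        # check cannot tell a fully written segment from a recycled,
        # partly filled one, which is why this is better done by the
        # server itself.
        if os.path.isfile(local) and os.path.getsize(local) == SEGMENT_SIZE:
            sys.exit(1)

        # Otherwise fetch the segment from the archive as usual.
        result = subprocess.run(FETCH + [ARCHIVE + filename, dest])
        sys.exit(result.returncode)

    if __name__ == '__main__':
        main()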

It should also be possible to work around this problem by delaying
archiving using "wal_keep_segments", or replication slots, or simply
with sleep().  However, none of those is the correct solution to
the problem.  We don't need the master to keep more segments for
the standbys; we already have more than enough.
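
For completeness, the retention side of that workaround would amount
to something like the following on the master (the value and the slot
name are only illustrative):

    # postgresql.conf: retain extra segments for the standbys
    wal_keep_segments = 128

    -- or pin the WAL with a physical replication slot
    SELECT pg_create_physical_replication_slot('standby_1');

but this only masks the ordering problem described above.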

== The Change ==

This "restore_command" behavior is coming from the initial archiving
and point-in-time-recovery implementation [2].   The code says
"the reason is that the file in XLOGDIR could be an old, un-filled or
partly-filled version that was copied and restored as part of
backing up $PGDATA."  This was probably a good reason in 2004, but
I don't think it still is.  AFAIK "pg_basebackup" eliminates this
problem.  Also, with this reasoning, we should also try streaming from
the master before trying the local WAL, but AFAIU we don't.

If there is a consensus on fixing this, I can try to prepare
a patch.  The company I am currently working for is also interested
in sponsoring a support company to fix this problem.

[1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/access/transam/xlogarchive.c;h=5c6de4989c9a#l72

[2] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=66ec2db7284
