On 3/21/17 8:45 PM, Venkata B Nagothi wrote:
On Tue, Mar 21, 2017 at 8:46 AM, David Steele <da...@pgmasters.net
Unfortunately, I don't think the first patch (recoveryStartPoint)
will work as currently implemented. The problem I see is that the
new function recoveryStartsHere() depends on pg_control containing a
checkpoint right at the end of the backup. There's no guarantee
that this is true, even if pg_control is copied last. That means a
time, lsn, or xid that occurs somewhere in the middle of the backup
can be selected without complaint from this code depending on timing.
Yes, that is true. The latest best position, checkpoint position, xid
and timestamp of the restored backup of the backup is shown up in the
pg_controldata, which means, that is the position from which the
recovery would start.
Backup recovery starts from the checkpoint in the backup_label, not from
the checkpoint in pg_control. The original checkpoint that started the
backup is generally overwritten in pg_control by the end of the backup.
Which in-turn means, WALs start getting replayed
from that position towards --> minimum recovery position (which is the
end backup, which again means applying WALs generated between start and
the end backup) all the way through to --> recovery target position.
minRecoveryPoint is only used when recovering a backup that was made
from a standby. For backups taken on the master, the backup end WAL
record is used.
The best start position to check with would the position shown up in the
pg_control file, which is way much better compared to the current
I don't agree, for the reasons given previously.
The tests pass (or rather fail as expected) because they are written
using values before the start of the backup. It's actually the end
of the backup (where the database becomes consistent on recovery)
that defines where PITR can start and this distinction definitely
matters for very long backups. A better test would be to start the
backup, get a time/lsn/xid after writing some data, and then make
sure that time/lsn/xid is flagged as invalid on recovery.
The current behaviour is that, if the XID shown-up by the pg_controldata
is say 1400 and the specified recovery_target_xid is say 200, then,
postgresql just completes the recovery and promotes at the immediate
consistent recovery target, which means, the DBA has to all the way
start the restoration and recovery process again to promote the database
at the intended position.
I'm not saying that the current behavior is good, only that the proposed
behavior is incorrect as far as I can tell.
It is also problematic to assume that transaction IDs commit in
order. If checkPoint.latestCompletedXid contains 5 then a recovery
to xid 4 may or may not be successful depending on the commit
order. However, it appears in this case the patch would disallow
recovery to 4.
If the txid 4 is the recovery target xid, then, the appropriate backup
taken previous to txid 4 must be used or as an alternative recovery
target like recovery_target_timestamp must be used to proceed to the
respective recovery target xid.
I'm not sure I follow you here, but I do know that using the last
committed xid says little about which xids precede or follow it.
I think this logic would need to be baked into recoveryStopsAfter()
and/or recoveryStopsBefore() in order to work. It's not clear to me
what that would look like, however, or if the xid check is even
The above two functions would determine if the recovery process has to
stop at a particular position or not and the patch has nothing to do
To summarise, this patch only determines if the WAL replay has to start
at all. In other words, if the specified recovery target falls prior to
the restored database position, then, the WAL replay should not start,
which prevent the database from getting promoted at an unintended
I understand what you are trying to do. My point is that the
information you are working from (whatever checkpoint happens to be in
pg_control when recovery starts) is not sufficient for the task. This
method is inaccurate and would disallow valid recovery scenarios.
I suggest doing a thorough read-through of StartupXLOG(), particularly
the section where the backup_label is loaded.
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)
To make changes to your subscription: