On 3/26/17 7:34 PM, Venkata B Nagothi wrote:
> Hi David, 
> 
> On Thu, Mar 23, 2017 at 4:21 AM, David Steele <da...@pgmasters.net
> <mailto:da...@pgmasters.net>> wrote:
> 
>     On 3/21/17 8:45 PM, Venkata B Nagothi wrote:
> 
>         On Tue, Mar 21, 2017 at 8:46 AM, David Steele
>         <da...@pgmasters.net <mailto:da...@pgmasters.net>
> 
>             Unfortunately, I don't think the first patch
>             (recoveryStartPoint) will work as currently implemented.
>             The problem I see is that the new function
>             recoveryStartsHere() depends on pg_control containing a
>             checkpoint right at the end of the backup.  There's no
>             guarantee that this is true, even if pg_control is copied
>             last.  That means a time, lsn, or xid that occurs somewhere
>             in the middle of the backup can be selected without
>             complaint from this code depending on timing.
> 
> 
>         Yes, that is true.  The latest position (checkpoint
>         position, xid and timestamp) of the restored backup is shown
>         in the pg_controldata output, which means that is the position
>         from which the recovery would start.
> 
> 
>     Backup recovery starts from the checkpoint in the backup_label, not
>     from the checkpoint in pg_control.  The original checkpoint that
>     started the backup is generally overwritten in pg_control by the end
>     of the backup.
> 
> 
> Yes, I totally agree. My initial intention was to compare the recovery
> target position(s) with the contents of the backup_label, but then the
> checks would fail if the recovery is performed without the backup_label
> file. So I decided to compare the recovery target positions with the
> contents of the pg_control file.
> 
> 
>         Which in turn means WAL replay starts from that position,
>         proceeds to the minimum recovery position (which is the end of
>         the backup, which again means applying the WAL generated
>         between the start and the end of the backup), and continues
>         all the way through to the recovery target position.
> 
> 
>     minRecoveryPoint is only used when recovering a backup that was made
>     from a standby.  For backups taken on the master, the backup end WAL
>     record is used.
> 
>         The best start position to check against would be the position
>         shown in the pg_control file, which is much better than the
>         current PostgreSQL behaviour.
> 
> 
>     I don't agree, for the reasons given previously.
> 
> 
> As explained above, my intention was to ensure that the recovery start
> positions checks are successfully performed irrespective of the presence
> of the backup_label file.
> 
> I did some analysis before deciding to use pg_controldata's output
> instead of backup_label file contents.
> 
> Comparing the output of pg_controldata with the contents of
> backup_label:
> 
>     *Recovery Target LSN*
> 
>     START WAL LOCATION (which is 0/9C000028) in the backup_label =
>     pg_controldata's latest checkpoint's REDO location (Latest
>     checkpoint's REDO location:  0/9C000028)
> 
>     *Recovery Target TIME*
> 
>     backup start time in the backup_label (START TIME: 2017-03-26
>     11:55:46 AEDT) = pg_controldata's latest checkpoint time (Time of
>     latest checkpoint :  Sun 26 Mar 2017 11:55:46 AM AEDT)
> 
>     *Recovery Target XID*
> 
>     To begin with, backup_label does not contain any start XID. So, the
>     only option is to depend on pg_controldata's output.
>     After a few quick tests and thorough observation, I notice that
>     the pg_control file information is copied as it is to the backup
>     location at pg_start_backup. I performed some quick tests by
>     running a few transactions between pg_start_backup and pg_stop_backup.
>     So, I believe this is the ideal start point for WAL replay.
> 
>     Am I missing anything here?

You are making assumptions about the contents of pg_control vs.
backup_label based on trivial tests.  With PG defaults, the backup must
run about five minutes before the values in pg_control and backup_label
will diverge.  Even if pg_control and backup_label do match, those are
the wrong values to use, and will get more incorrect the longer the
backup runs.
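
For illustration only, here is a minimal Python sketch of the LSN
comparison described in the quoted analysis.  It is not part of the
patch: parse_lsn and label_matches_control are hypothetical names, and
the second LSN pair is made up to show the values diverging once a later
checkpoint has been written during a longer backup.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN such as '0/9C000028' into a 64-bit integer.

    An LSN is printed as two hexadecimal numbers separated by a slash:
    the high 32 bits, then the low 32 bits.
    """
    high, low = text.strip().split("/")
    return (int(high, 16) << 32) | int(low, 16)


def label_matches_control(backup_label_start: str, control_redo: str) -> bool:
    """Return True when backup_label's START WAL LOCATION equals the
    REDO location of the latest checkpoint reported for pg_control."""
    return parse_lsn(backup_label_start) == parse_lsn(control_redo)


# Values quoted above from the short test backup: they match.
print(label_matches_control("0/9C000028", "0/9C000028"))

# Hypothetical values for a backup that ran past a later checkpoint:
# pg_control has moved on, so the two no longer match.
print(label_matches_control("0/9C000028", "0/9E000060"))
```

A match only shows that no later checkpoint was recorded in the copied
pg_control; it says nothing about where the backup ends.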

I believe there is a correct way to go about this, at least for time and
LSN, and I don't think your very approximate solution will pass muster
with a committer.

Since we are nearly at the end of the CF, I have marked this submission
"Returned with Feedback".

-- 
-David
da...@pgmasters.net

