Re: [HACKERS] [PATCH] A crash and subsequent recovery of themaster can cause the slave to get out-of-sync

Simon Riggs Mon, 23 Apr 2007 03:20:53 -0700

On Thu, 2007-04-19 at 22:37 +0200, Florian G. Pflug wrote:

> I believe I have discovered the following problem in pgsql 8.2 and HEAD,
> concerning warm-standbys using WAL log shipping.
> 
> The problem is that after a crash, the master might complete incomplete
> actions via rm_cleanup() - but since it won't wal-log those changes,
> the slave won't know about this. This will at least prevent the creation
> of any further restart points on the slave (because safe_restartpoint)
> will never return true again - it it might even cause data corruption,
> if subsequent wal records are interpreted wrongly by the slave because
> it sees other data than the master did when it generated them.


I agree the problem exists. It is somewhat rare because the idea is that
if the Primary crashes we would failover to the Standby, which would
mean that both Primary and Standby have executed rm_cleanup(), if
needed.

So in the case where the Primary fails and we choose *not* to failover,
there is a potential difficulty on the Standby.

> Attached is a patch that lets RecoveryRestartPoint call all
> rm_cleanup() methods and create a restart point whenever it encounters
> a shutdown checkpoint in the wal (because those are generated after
> recovery). This ought not cause a performance degradation, because
> shutdown checkpoints will occur very infrequently.
> 
> The patch is per discussion with Simon Riggs.
> 
> I've not yet had a chance to test this patch, I only made sure
> that it compiles. I'm sending this out now because I hope this
> might make it into 8.2.4.

The rationale for this fix could be described somewhat differently:

When we shutdown, we know for certain that safe_restartpoint() is true.
However, we don't know whether it is true because we successfully did a
clean shutdown, or because we crashed, recovered and then issued a
shutdown checkpoint following recovery. In the latter case we *must*
execute rm_cleanup() on the standby because it has been executed on the
primary. Not doing so at this point *might* be safe, but is not certain
to be safe. We don't *need* to log a restartpoint at this time, but it
seems sensible to do so.

We need to check that rm_cleanup() routines don't assume that they will
only ever be called once or this will clearly fail. There is also no
need to call rm_cleanup() unless rm_safe_restartpoint() is false.

-- 
  Simon Riggs             
  EnterpriseDB   http://www.enterprisedb.com



---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly

Re: [HACKERS] [PATCH] A crash and subsequent recovery of themaster can cause the slave to get out-of-sync

Reply via email to