On Wed, Oct 29, 2014 at 6:21 AM, Maeldron T. <maeld...@gmail.com> wrote:
> I swear I have read a couple of old threads. Yet I am not sure if it safe to
> failback to the old master in case of async replication without base backup.
>
> Considering:
> I have the latest 9.3 server
> A: master
> B: slave
> B is actively connected to A
>
> I shut down A manually with -m fast (it's the default FreeBSD init script
> setting)
> I remove the recovery.conf from B
> I restart B
> I create a recovery.conf on A
> I start A
> I see nothing wrong in the logs
> I go for a lunch
> I shut down B
> I remove the recovery.conf on A
> I restart A
> I restore the recovery.conf on B
> I start B
> I see nothing wrong in the logs and I see that replication is working
>
> Can I say that my data is safe in this case?
>
> If the answer is yes, is it safe to do this if there was a power outage on A
> instead of manual shutdown? Considering that the log says nothing wrong. (Of
> course if it complains I'd do base backup from B).
The threshold question here is whether the original master might have
written (and thus, perhaps, applied) write-ahead log records that were
not replayed on the slave. If A crashed, that is definitely possible,
so this is definitely not safe.

If A was shut down cleanly, then streaming replication *should* send
everything up through the shutdown checkpoint to the standby, which
*should* replay it. If all goes according to plan, I think this will
work.

I'm not sure we really have enough safeties to make this robust,
though. For example, at the point when the shutdown checkpoint is
written, I believe that the master is no longer accepting new
connections - so if the connection to the slave is broken before the
shutdown checkpoint record is replicated, then it's not safe any more,
but how will we detect that? And, if you remove recovery.conf on the
slave, it will abort replay and enter normal running as soon as it
reaches what it thinks is end-of-WAL, with no cross-check to make sure
that's really the same point that the master was actually at.

So it strikes me that it might be quite difficult to really have
confidence that nothing will go wrong. I'm definitely not the expert
in this area on this mailing list, so I'm curious what others think.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
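
For illustration only, here is a rough sketch of the kind of manual cross-check that is missing: compare the shutdown checkpoint location reported by pg_controldata on the stopped master with the replay location reported by the standby (pg_last_xlog_replay_location() in 9.3) before promoting it. The helper names are made up for this example, and it assumes that the replay location is the end of the last replayed record, so it must be strictly past the start of the checkpoint record. It is not a substitute for a real safety mechanism in the server.

```python
import re


def parse_lsn(text):
    """Convert an LSN string such as '0/3000028' into an integer byte position."""
    high, low = text.strip().split("/")
    return (int(high, 16) << 32) | int(low, 16)


def checkpoint_from_controldata(output):
    """Return (cluster state, latest checkpoint LSN) parsed from pg_controldata output."""
    state = re.search(r"^Database cluster state:\s*(.+)$", output, re.M)
    ckpt = re.search(
        r"^Latest checkpoint location:\s*([0-9A-Fa-f]+/[0-9A-Fa-f]+)\s*$",
        output,
        re.M,
    )
    if state is None or ckpt is None:
        raise ValueError("unrecognized pg_controldata output")
    return state.group(1).strip(), parse_lsn(ckpt.group(1))


def standby_has_shutdown_checkpoint(controldata_output, standby_replay_lsn):
    """Hypothetical check: did the standby replay the master's shutdown checkpoint?

    controldata_output: text printed by pg_controldata on the stopped master.
    standby_replay_lsn: result of SELECT pg_last_xlog_replay_location() on the standby.
    """
    state, checkpoint_lsn = checkpoint_from_controldata(controldata_output)
    if state != "shut down":
        # The master crashed or is still running: no clean shutdown checkpoint to rely on.
        return False
    # Assumption: the replay location is the END of the last replayed record,
    # so it must lie strictly beyond the START of the checkpoint record.
    return parse_lsn(standby_replay_lsn) > checkpoint_lsn
```

One would run pg_controldata against A's data directory after the shutdown, query B for its replay location, and only remove recovery.conf on B if the function returns True. After a power outage on A the cluster state is not "shut down", so the check fails, which matches the "definitely not safe" case above.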