Robert,

> > What would such a test look like?  It's not obvious to me that
> > there's any rapid way for a user to detect this situation, without
> > checking each server individually.
> 
> Change something on the master and observe that none of the supposed
> standbys notice?

That doesn't sound like an infallible test, or a 60-second one.

My point is that in a complex situation (imagine a shop with 9 replicated 
servers in 3 different cascaded groups, immediately after a failover of the 
original master), it would be easy for a sysadmin, responding to a 
middle-of-the-night page, to accidentally fat-finger an IP address and create 
a cycle instead of a new master.  And once he's done that, there's a longish 
troubleshooting process to figure out what's wrong and why writes aren't 
working, especially if he goes to bed and some other sysadmin picks up the 
"Writes failing to PostgreSQL" ticket.

*If* it's relatively easy for us to detect cycles (that's a big if; I'm not 
sure how we'd do it), then it would help a lot for us to at least emit a 
WARNING.  That would short-cut a lot of troubleshooting.
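
To illustrate what "detecting a cycle" means here, a minimal sketch in Python. 
It assumes something has already collected, for every server, which upstream 
it replicates from; gathering that is the hard part, since no single server 
sees the whole topology. The `upstream` map and the function name are 
hypothetical, for illustration only, and not anything PostgreSQL provides.

```python
from typing import Dict, List, Optional


def find_replication_cycle(upstream: Dict[str, Optional[str]]) -> Optional[List[str]]:
    """Return the servers forming a cycle, or None if every chain ends at a master.

    `upstream` is a hypothetical map: server name -> the server it replicates
    from, with None meaning "this server is a master". A server that does not
    appear as a key is also treated as a master.
    """
    done = set()  # servers already known to lead to a master
    for start in upstream:
        path: List[str] = []            # servers visited on this walk, in order
        position: Dict[str, int] = {}   # server -> its index in `path`
        node: Optional[str] = start
        while node is not None and node not in done:
            if node in position:
                # We came back to a server seen on this walk: that is a cycle.
                return path[position[node]:]
            position[node] = len(path)
            path.append(node)
            node = upstream.get(node)
        # The walk ended at a master (or at a known-good server).
        done.update(path)
    return None


# One fat-fingered address: "a" was meant to become the new master but was
# pointed at "c" instead, so a, b and c all wait on each other.
broken = {"a": "c", "b": "a", "c": "b"}
cycle = find_replication_cycle(broken)
if cycle is not None:
    print("WARNING: replication cycle detected among:", ", ".join(cycle))
```

Each server has at most one upstream, so following the chain from any server 
either reaches a master or revisits a server, and the whole check is linear in 
the number of servers.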

--Josh Berkus


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
