On Fri, Oct 15, 2010 at 8:41 AM, Fujii Masao <masao.fu...@gmail.com> wrote:
> Hi,
>
> As the result of the discussion, I think that we need the following two
> parameters for the case where the standby goes down.
>
> * replication_timeout
>   This is the maximum time to wait for the ACK from the standby. If this
>   timeout expires, the master closes the replication connection and
>   disconnects the standby. This parameter is just used for the master
>   to detect the standby crash or the network outage.
>
>   We already have keepalive parameters for that purpose. But they cannot
>   detect the disconnection in some cases. So replication_timeout needs
>   to be introduced for sync rep.

Good design, +1.

> * allow_standalone_master
>   This specifies whether we allow the master to process transactions
>   alone when there is no connected and sync'd standby.
>
>   If this is false, all the transactions on the master are blocked until
>   a sync'd standby has appeared. Of course, this happens not only when
>   replication_timeout expires but also when we start the master alone
>   at the initial setup, when the master detects the disconnection by
>   using keepalive parameters, and when the standby is shut down normally.
>   People who want 'wait-forever' will disable this parameter to reduce
>   the risk of data loss.
>
>   OTOH, if this is true, the absence of a sync'd standby doesn't prevent
>   the master from processing transactions alone. People who want high
>   availability even though the risk of data loss increases will enable
>   this parameter.

I'm not wild about the name, but otherwise this seems well-designed.

> The timeout does not conflict with 'wait-forever'. Even if you choose
> 'wait-forever' (i.e., you set allow_standalone_master to false), the master
> should detect the standby crash as soon as possible by using the
> timeout. For example, imagine that max_wal_senders is set to one and
> the master cannot detect the standby crash because of the absence of the
> timeout. In this case, even if you start a new standby, it will not be
> able to connect to the master since there is no free walsender slot.
> As a result, the master actually waits forever.

Good point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
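
For illustration only, the way the two proposed parameters interact could be sketched roughly as below. This is a toy Python model, not PostgreSQL code: the `Standby` class, the `commit_action` function, and the three action names are hypothetical and exist only for this example. The parameter names `replication_timeout` and `allow_standalone_master` are the ones proposed above; the units (seconds) are an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Standby:
    """Hypothetical view of one connected standby, as seen by the master."""
    synced: bool          # standby has caught up and replicates synchronously
    last_ack_at: float    # time (in seconds) at which the last ACK arrived


def standby_timed_out(standby: Standby, now: float,
                      replication_timeout: float) -> bool:
    """True when no ACK has arrived within replication_timeout seconds."""
    return now - standby.last_ack_at > replication_timeout


def commit_action(standby: Optional[Standby], now: float,
                  replication_timeout: float,
                  allow_standalone_master: bool) -> str:
    """Return what the master does with a committing transaction.

    "wait_for_ack": a sync'd standby is connected, so wait for its ACK
    "commit_alone": no sync'd standby, and standalone operation is allowed
    "block":        no sync'd standby, so wait until one appears
    """
    if standby is not None and standby_timed_out(standby, now,
                                                 replication_timeout):
        # Timeout expired: treat the standby as disconnected. In the proposal
        # this is also what frees the walsender slot for a new standby.
        standby = None

    if standby is not None and standby.synced:
        return "wait_for_ack"
    return "commit_alone" if allow_standalone_master else "block"
```

The point the sketch tries to capture is that the timeout only decides when the standby counts as gone, while `allow_standalone_master` decides what happens afterwards. With 'wait-forever' (the parameter set to false) the timeout still matters, because without it the dead connection would never be dropped.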