Russ,

pgpool-II can run a script every time a failover event occurs (see 'failover_command' in the manual). You can write a script that calls pcp_attach_node to reattach the failed node once access to it is restored. For example, keep running "SELECT 1;" against the remote database until it succeeds, which means connectivity is back. Then issue pcp_attach_node for that node. And that's it.
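
Here is a rough sketch of that kind of script in Python, in case it helps. Treat it as a starting point, not a tested recipe: the function name wait_and_reattach, the PCP port 9898, the user names, and the database name are all placeholders I made up for illustration. The pcp_attach_node arguments are shown in the option style of newer pgpool-II releases; older releases take positional arguments instead, so check the manual for your version. The manual also describes placeholders such as %d (node id) and %h (host name) that failover_command can pass to the script.

```python
import subprocess
import time


def wait_and_reattach(probe_cmd, attach_cmd, interval=5.0, max_attempts=None,
                      run=subprocess.call, sleep=time.sleep):
    """Run probe_cmd until it exits with status 0, then run attach_cmd once.

    Returns the exit status of attach_cmd, or None if the probe never
    succeeded within max_attempts (None means retry forever).
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        # psql exits with 0 only if it connected and the query ran.
        if run(probe_cmd) == 0:
            return run(attach_cmd)
        sleep(interval)
    return None


# Hypothetical commands; every host, port, user, and node id here is an
# assumption to adapt to your own setup.
PROBE_CMD = ["psql", "-h", "10.177.77.115", "-p", "5432",
             "-U", "postgres", "-d", "postgres", "-c", "SELECT 1;"]

# Option-style syntax (newer pgpool-II releases). Older releases expect
# positional arguments: timeout hostname port username password nodeid.
ATTACH_CMD = ["pcp_attach_node", "-h", "localhost", "-p", "9898",
              "-U", "pcp_admin", "-w", "-n", "0"]

# To use it, call:
#   wait_and_reattach(PROBE_CMD, ATTACH_CMD, interval=5.0)
```

The commands are passed in as argument lists so you can swap in whatever syntax your pgpool-II version expects. Since the loop can run for a long time, you would probably want failover_command to start it in the background rather than wait for it.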
BTW, looking at your pgpool.conf, you only have one node configured in pgpool; I'm assuming that is intentional.

Daniel

> -----Original Message-----
> From: [email protected] [mailto:pgpool-general-
> [email protected]] On Behalf Of Russ Neufeld
> Sent: Tuesday, May 25, 2010 2:58 PM
> To: [email protected]
> Subject: [Pgpool-general] Recovery from network outage
>
> Hi all,
>
> How do I set up pgpool to recover from the occasional network
> outage? This morning we briefly lost network connectivity between our
> web machine and our db machine, and this showed up in
> /var/log/messages:
>
> May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 ERROR: pid 17169:
> connect_inet_domain_socket: connect() failed: No route to host
> May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 ERROR: pid 17169:
> connection to 10.177.77.115(5432) failed
> May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 ERROR: pid 17169:
> new_connection: create_cp() failed
> May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 LOG: pid 17169:
> notice_backend_error: 0 fail over request from pid 17169
> May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 LOG: pid 16889:
> starting degeneration. shutdown host 10.177.77.115(5432)
> May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 ERROR: pid 16889:
> failover_handler: no valid DB node found
> May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 LOG: pid 16889:
> failover_handler: set new master node: 1
> May 25 06:35:05 web pgpool: 2010-05-25 06:35:05 LOG: pid 16889:
> failover done. shutdown host 10.177.77.115(5432)
>
> I needed to restart pgpool manually for it to recover. Here's
> what our pgpool.conf looks like:
>
> listen_addresses = 'localhost'
> port = 5432
> enable_pool_hba = true
> replication_mode = false
> load_balance_mode = false
> master_slave_mode = false
> backend_hostname0 = '10.177.77.115'
> backend_port0 = 5432
> health_check_period = 0
> fail_over_on_backend_error = false
> connection_cache = true
> num_init_children = 20
> max_pool = 2
> child_life_time = 300
> connection_life_time = 0
> child_max_connections = 0
> child_idle_limit = 0
> authentication_timeout = 30
>
> Do I need to play with health_check_period and/or
> health_check_timeout to get this right? Is there a way to make pgpool
> resilient to network blips, or is this always a manual recovery?
>
> Thanks,
>
> Russ
> _______________________________________________
> Pgpool-general mailing list
> [email protected]
> http://pgfoundry.org/mailman/listinfo/pgpool-general
