Hello, I will join this thread (without being invited :P) because I am having the same problem with pgpool.

I have had it running for a week in a high-load environment and decided to test online recovery, which has always worked in the test environment, but I get:

ERROR: pid 1199: wait_connection_closed: existing connections did not close in 90 sec

I have tested different values for client_idle_limit_in_recovery and recovery_timeout, but all of them fail: pgpool simply can't close all the connections. Just sharing another experience, and hopefully waiting for a possible solution.

Regards,

---
Fernando Marcelo
www.consultorpc.com
[email protected]

On 15/01/2010, at 06:11, Christophe Philemotte wrote:

> Hi Jaume Sabater,
>
> Thanks for answering me.
>
>>> Before presenting my problem, just a question about the 2nd stage
>>> (you'll understand that this question is linked to my problem). Why do
>>> client connections have to be closed during this stage? Couldn't the
>>> recovered node catch up with the master node without stopping the service?
>>
>> During the second stage, current idle connections are closed, and open
>> connections are given some time to finish their work before being
>> closed. Connections need to be closed before the node being recovered
>> is started because, when it starts, it will obtain and process the pending
>> log files. This will put the two nodes in sync, and then the queued
>> requests will be processed.
>>
>> Conclusion: only when all connections are closed can pgpool-II be
>> sure that the two nodes will be perfectly in sync.
>
> Tell me if I'm wrong. That means it is impossible to recover online
> if there are heavily used persistent open connections, and I would have to
> design my client application not to use persistent connections if I want
> to perform online recovery. Is that correct?
>
>>> Now, let me present my problem. When I test online recovery under a
>>> typical database load, I get two failure scenarios:
>>> 1. when client_idle_limit_in_recovery is set (the best value I found
>>> is 10s), the online recovery completes, but a few client requests
>>> fail (timeout or closed connection);
>>> 2. when client_idle_limit_in_recovery isn't set, the online recovery
>>> is not done, because a few client connections cannot be closed
>>> (they are actively used by client processes, not lazy ones).
>>
>> I have also had problems under certain circumstances when trying to
>> recover a node, ever since I first started with pgpool-II. Tatsuo
>> (and other contributors) have fixed some of these scenarios, but I
>> think there is still the problem that certain connections are not
>> dropped during the second stage, hence blocking the whole process.
>
> Without using client_idle_limit_in_recovery, that is what I have noticed.
>
>> I believe, but I cannot be sure, that when the DBA of my main client
>> has pgadmin3 open, it always fails. Usually, when connections come only
>> from the front-ends of the web platform (i.e. open connection, send a
>> request, (supposedly) close connection), everything goes fine. But
>> with "persistent" connections, so to speak, pgpool-II is not always
>> capable of dropping them.
>
> OK, that is the feeling I expressed above.
>
>> Does it make any sense to you, Tatsuo?
> Does it?
>
> Regards,
>
> Christophe Philemotte
> _______________________________________________
> Pgpool-general mailing list
> [email protected]
> http://pgfoundry.org/mailman/listinfo/pgpool-general
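For anyone following along, the parameters discussed in this thread are set in pgpool.conf. A minimal sketch of the online-recovery section (values are illustrative only, not recommendations; check the documentation of your pgpool-II version for exact names and defaults):

```
# pgpool.conf -- online recovery settings (illustrative values)
recovery_user = 'postgres'           # PostgreSQL user performing the recovery
recovery_timeout = 90                # seconds to wait in the 2nd stage for
                                     # existing connections to close before
                                     # failing with wait_connection_closed
client_idle_limit_in_recovery = 10   # during the 2nd stage, disconnect clients
                                     # idle for more than 10 s
                                     # (0 = never disconnect, -1 = disconnect
                                     # all clients immediately)
```

Note the trade-off described above: a low client_idle_limit_in_recovery lets recovery complete but may kill in-flight client requests, while leaving it unset means recovery can never finish if persistent connections stay busy.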
