Fujii Masao wrote:
> On Thu, Sep 17, 2009 at 5:08 PM, Heikki Linnakangas
> <heikki.linnakan...@enterprisedb.com> wrote:
> > Walreceiver is really a slave to the startup process. The startup
> > process decides when it's launched, and it's the startup process that
> > then waits for it to advance. But the way it's set up at the moment, the
> > startup process needs to ask the postmaster to start it up, and it
> > doesn't look very robust to me. For example, if launching walreceiver
> > fails for some reason, startup process will just hang waiting for it.
> 
> I changed the postmaster to report a failed fork of the walreceiver
> to the startup process by resetting WalRcv->in_progress, which prevents
> the startup process from getting stuck when launching walreceiver fails.
> http://archives.postgresql.org/pgsql-hackers/2009-09/msg01996.php
> 
> Do you have another concern about the robustness? If yes, I'll address that.

Hmm.  Without looking at the patch at all, this seems similar to how
autovacuum does things: autovac launcher signals postmaster that a
worker needs to be started.  Postmaster proceeds to fork a worker.  This
could obviously fail for a lot of reasons.

Now, there is code in place to notify the user when forking fails, and
this is seen in the wild quite a bit more than one would like :-(  I
think it would be a good idea to have a retry mechanism in the
walreceiver startup path so that recovery does not get stuck due to
transient problems.
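
To illustrate the kind of retry I mean, here is a rough standalone
sketch. It is not taken from the patch or from the PostgreSQL tree:
launch_with_retry, RetryResult, launch_fn and flaky_launch are all
hypothetical names, and the launcher callback merely stands in for
"ask postmaster to fork walreceiver and report whether it worked".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true if the child was started. */
typedef bool (*launch_fn)(void *arg);

/* Outcome of the retry loop (hypothetical type, for illustration only). */
typedef struct RetryResult
{
    bool started;   /* did any attempt succeed? */
    int  attempts;  /* number of attempts actually made */
} RetryResult;

/*
 * Try to launch up to max_attempts times.  The optional backoff callback
 * is invoked between failed attempts so the caller can sleep or wait on a
 * latch; pass NULL to retry immediately.
 */
RetryResult
launch_with_retry(launch_fn launch, void *arg, int max_attempts,
                  void (*backoff)(int attempt))
{
    RetryResult result = {false, 0};

    assert(launch != NULL && max_attempts > 0);

    for (int attempt = 1; attempt <= max_attempts; attempt++)
    {
        result.attempts = attempt;
        if (launch(arg))
        {
            result.started = true;
            break;
        }
        /* Transient failure: wait before the next attempt, if any. */
        if (backoff != NULL && attempt < max_attempts)
            backoff(attempt);
    }
    return result;
}

/*
 * Simulated launcher: fails while *failures_left > 0, then succeeds.
 * This mimics a fork() that fails transiently.
 */
bool
flaky_launch(void *arg)
{
    int *failures_left = arg;

    if (*failures_left > 0)
    {
        (*failures_left)--;
        return false;
    }
    return true;
}
```

With a launcher that fails twice, the third attempt succeeds. If every
attempt fails, the caller gets started == false back and can report an
error instead of waiting forever.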

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
