On 05.07.2011 00:42, Florian Pflug wrote:
On Jul4, 2011, at 23:11 , Peter Geoghegan wrote:
On 4 July 2011 17:36, Florian Pflug<f...@phlo.org>  wrote:
Btw, with the death-watch / life-sign / whatever infrastructure in place,
shouldn't PostmasterIsAlive() be using that instead of getppid() / kill(0)?

Hmm, maybe. That seems like a separate issue though, that can be
addressed with another patch. It does have the considerable
disadvantage of making Heikki's proposed assertion failure useless. Is
the implementation of PostmasterIsAlive() really a problem at the
moment?

I'm not sure that there is currently a guarantee that PostmasterIsAlive
will returns false immediately after select() indicates postmaster
death. If e.g. the postmaster's parent is still running (which happens
for example if you launch postgres via daemontools), the re-parenting of
backends to init might not happen until the postmaster zombie has been
vanquished by its parent's call of waitpid(). It's not entirely
inconceivable for getppid() to then return the (dead) postmaster's pid
until that waitpid() call has occurred.

Good point, and testing shows that that is exactly what happens at least on Linux (see attached test program). So, as the code stands, the children will go into a busy loop until the grandparent calls waitpid(). That's not good.

In that light, I agree we should replace kill() in PostmasterIsAlive() with read() on the pipe. It would react faster than the kill()-based test, which seems like a good thing. Or perhaps do both, and return false if either test says the postmaster is dead.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	pid_t parentpid,
		childpid;
	int pipefds[2];

	if ((parentpid = fork()) == 0)
	{
		parentpid = getpid();
		pipe(pipefds);

		if ((childpid = fork()) == 0)
		{
			char c;
			int rc;

			close(pipefds[1]);

			/* Wait for pipe to close */
			printf ("read() returned %d\n", read(pipefds[0], &c, 1));
			while (kill(parentpid, 0) == 0)
			{
				printf("kill() says the parent is still alive\n");
				sleep(1);
			}
			printf("kill() confirms that the parent is dead\n");
		}
		else
		{
			sleep(3);
			printf("parent exits now\n");
		}
	}
	else
	{
		printf("grandparent is sleeping\n");
		sleep(6);
		printf("grandparent exits now\n");
	}
}
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to