On Jul 4, 2011, at 17:53, Heikki Linnakangas wrote:
>> Under Linux, select() may report a socket file descriptor as "ready for
>> reading", while nevertheless a subsequent read blocks. This could for
>> example happen when data has arrived but upon examination has wrong
>> checksum and is discarded. There may be other circumstances in which a
>> file descriptor is spuriously reported as ready. Thus it may be safer
>> to use O_NONBLOCK on sockets that should not block.
>
> So in theory, on Linux WaitLatch might sometimes incorrectly return
> WL_POSTMASTER_DEATH. None of the callers check for WL_POSTMASTER_DEATH return
> code, they call PostmasterIsAlive() before assuming the postmaster has died,
> so that won't affect correctness at the moment. I doubt that scenario can
> even happen in our case, select() on a pipe that is never written to. But
> maybe we should add an assertion to WaitLatch to assert that if select()
> reports that the postmaster pipe has been closed, PostmasterIsAlive() also
> returns false.

The correct solution would be to read() from the pipe after select()
returns, and only return WL_POSTMASTER_DEATH if the read doesn't fail
with EAGAIN. To prevent that read() from blocking if the read event was
indeed spurious, O_NONBLOCK must be set on the pipe, but the patch does
that already.

Btw, with the death-watch / life-sign / whatever infrastructure in place,
shouldn't PostmasterIsAlive() be using that instead of getppid() / kill(0)?

best regards,
Florian Pflug

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
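
For illustration, the read()-after-select() check described above could look
roughly like the sketch below. It is not taken from the patch: the helper
names set_nonblocking() and watch_pipe_signals_death() are made up, and the
retry on EINTR is an added assumption. The only behaviour it relies on is
that a non-blocking read() on an empty pipe fails with EAGAIN while a write
end is still open, and returns 0 (EOF) once every write end has been closed.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/*
 * Hypothetical helper: put fd into non-blocking mode.
 * Returns 0 on success, -1 on error.
 */
static int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Hypothetical helper: call after select() has reported the watch pipe
 * as readable. Returns false only if read() fails with EAGAIN, i.e. the
 * readiness report was spurious. Any other outcome, including EOF, is
 * treated as postmaster death.
 */
static bool
watch_pipe_signals_death(int fd)
{
    char    c;
    ssize_t rc;

    assert(fd >= 0);

    /* Assumption: an interrupted read is simply retried. */
    do
    {
        rc = read(fd, &c, 1);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return false;           /* nothing to read: spurious wakeup */

    return true;                /* EOF or unexpected result */
}
```

With something like this, WaitLatch would return WL_POSTMASTER_DEATH only
when watch_pipe_signals_death() returns true for the postmaster pipe, and
would otherwise go back to waiting.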