I wrote:
> Hmm.  I think the problem is that poll_start() thinks it can just call
> start() a second time after a failure.  If it wasn't a true failure
> but a timeout, then _pid is now set and the second call complains.

Oh, wait --- the case that is failing is after 017_shm.pl has
intentionally kill -9'd a postmaster, so that its pidfile is
left behind.  The next attempted start fails on a shmem ID
conflict, but the failed start doesn't remove the old pidfile,
and then the code I added to sub start erroneously picks that
up as a live postmaster PID.

Seems like we need to do 'kill 0' on the PID we get from
the file to verify that there's really a postmaster there.
(I wonder how well that works on Windows?  perlport claims
it does, but ...)
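
A rough sketch of that check, for illustration only.  The helper
name live_postmaster_pid is hypothetical and this is not the actual
test-framework code; it assumes the first line of the pidfile holds
the PID.  Note that 'kill 0' only shows that some process with that
PID exists, not that it is a postmaster.

```perl
use strict;
use warnings;

# Hypothetical helper: return the PID recorded in the given pidfile,
# but only if a process with that PID currently exists.  Returns
# undef (in scalar context) otherwise.
sub live_postmaster_pid
{
	my ($pidfile) = @_;

	# No pidfile at all: nothing to report.
	open(my $fh, '<', $pidfile) or return;
	my $line = <$fh>;
	close($fh);

	# Ignore an empty file or a first line that isn't a plain number.
	return unless defined $line && $line =~ /^(\d+)\s*$/;
	my $pid = $1;

	# Signal 0 delivers nothing; it only tests whether the process
	# could be signaled, i.e. whether it exists.
	return $pid if kill(0, $pid);

	# EPERM means the process exists but belongs to another user.
	return $pid if $!{EPERM};

	# Otherwise treat the pidfile as stale.
	return;
}
```

With something like this, a pidfile left behind by a kill -9'd
postmaster would be treated as stale rather than as evidence of a
running server.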

I fear I still don't have the whole story though because
per this theory it should fail everywhere, yet it doesn't.

                        regards, tom lane

