As I posted before, changing the timeout from 1000 to
NMPWAIT_WAIT_FOREVER fixed the problem, or at least moved it such that
it no longer occurs easily.

To better understand the problem, I added debugging as Tom suggested.  I
restored the timeout on CallNamedPipe to 1000 ms and reran my tests.
Indeed, kill is encountering an error:
LOG:  kill(2168) failed: No such process

I instrumented pgkill to output the value of GetLastError() if
CallNamedPipe fails.  It returned error code 2, which Windows
identifies as ERROR_FILE_NOT_FOUND.  The logic in pgkill translates this
Windows error into an errno value of ESRCH.
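
For anyone following along, the translation step can be sketched roughly
as below.  This is a portable illustration, not the actual code in
src/port/kill.c: the helper name and the SKETCH_ constants are made up
here (the numeric values are the standard Windows system error codes),
and the EPERM and EINVAL branches are assumptions about how other
failures might be mapped.  Only the ERROR_FILE_NOT_FOUND to ESRCH case
comes from what I observed.

```c
#include <errno.h>

/*
 * Windows system error codes, redefined locally so this sketch compiles
 * without <windows.h>.  The SKETCH_ names are hypothetical.
 */
#define SKETCH_ERROR_FILE_NOT_FOUND 2UL
#define SKETCH_ERROR_ACCESS_DENIED  5UL

/*
 * Hypothetical helper: translate the value reported by GetLastError()
 * after a failed CallNamedPipe() into an errno value.
 */
static int
sketch_winerr_to_errno(unsigned long winerr)
{
    switch (winerr)
    {
        case SKETCH_ERROR_FILE_NOT_FOUND:
            /* Observed case: no pipe found, reported as "No such process". */
            return ESRCH;
        case SKETCH_ERROR_ACCESS_DENIED:
            /* Assumed mapping, not confirmed by the logs above. */
            return EPERM;
        default:
            /* Assumed catch-all for any other failure. */
            return EINVAL;
    }
}
```

With this mapping, error code 2 comes back as ESRCH, which is why the
log line reads "No such process" even if the target backend is alive.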

The Windows error is a bit surprising, at least to me -- I expected
something indicating the pipe was full. Does anyone have a richer
interpretation of this error?

Thanks,
Steve

-----Original Message-----
From: Tom Lane [mailto:t...@sss.pgh.pa.us] 

Alvaro Herrera <alvhe...@commandprompt.com> writes:
> Marshall, Steve wrote:
>> Any thoughts on how to confirm or deny Theory A?

> Try changing the 1000 to NMPWAIT_WAIT_FOREVER


As long as you're changing the source code, it'd be a good idea to
verify the supposition that kill() is failing, eg in
src/backend/commands/async.c

                        if (kill(listenerPID, SIGUSR2) < 0)
                        {
+                               elog(LOG, "kill(%d) failed: %m", listenerPID);
                                /*
                                 * Get rid of pg_listener entry if it refers to a PID that no
                                 * longer exists.  Presumably, that backend crashed without
                                 * deleting its pg_listener entries. This code used to only


If that's right, sprinkling a few debug printouts into src/port/kill.c
would be the next step.

                        regards, tom lane


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs
