As I posted before, changing the timeout from 1000 to NMPWAIT_WAIT_FOREVER fixed the problem, or at least made it much harder to trigger.

To better understand the problem, I added debugging as Tom suggested. I restored the CallNamedPipe timeout to 1000 ms and reran my tests. Indeed, kill is encountering an error:

    LOG:  kill(2168) failed: No such process

I then instrumented pgkill to output the value of GetLastError() if CallNamedPipe fails. It returned error code 2, which Windows identifies as ERROR_FILE_NOT_FOUND. The logic in pgkill translates this Windows error into an errno value of ESRCH, which is why the log message reads "No such process".

The Windows error is a bit surprising, at least to me -- I expected something indicating the pipe was full. Does anyone have a richer interpretation of this error?

Thanks,
Steve

-----Original Message-----
From: Tom Lane [mailto:t...@sss.pgh.pa.us]

Alvaro Herrera <alvhe...@commandprompt.com> writes:
> Marshall, Steve wrote:
>> Any thoughts on how to confirm or deny Theory A?

> Try changing the 1000 to NMPWAIT_WAIT_FOREVER

As long as you're changing the source code, it'd be a good idea to verify the supposition that kill() is failing, eg in src/backend/commands/async.c

            if (kill(listenerPID, SIGUSR2) < 0)
            {
+               elog(LOG, "kill(%d) failed: %m", listenerPID);
                /*
                 * Get rid of pg_listener entry if it refers to a PID that no
                 * longer exists.  Presumably, that backend crashed without
                 * deleting its pg_listener entries. This code used to only

If that's right, sprinkling a few debug printouts into src/port/kill.c would be the next step.

                        regards, tom lane