This patch fixes a problem where a backend blocks in pgwin32_waitforsinglesocket() when it tries to send something to the stats collector. The patch does two things:
1) pgwin32_waitforsinglesocket(): WaitForMultipleObjectsEx is now called with a finite timeout (100 ms) in the case of FD_WRITE on a UDP socket. If the timeout occurs, pgwin32_waitforsinglesocket() returns EINTR. Reason: as the tests show (see below), the process may sleep forever in WaitForMultipleObjectsEx when the timeout is infinite.
2) pgwin32_send(): add a loop around WSASend and pgwin32_waitforsinglesocket(). Reason: for a UDP socket, an 'ok' result from pgwin32_waitforsinglesocket() does not guarantee that the socket is still free; it can become busy again, and the following WSASend call will then fail with a WSAEWOULDBLOCK error.
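
For readers without a Windows box, the two changes above can be sketched roughly as follows. This is only a POSIX analogue under stated assumptions, not the patched socket.c: poll() stands in for WaitForMultipleObjectsEx, send() with MSG_DONTWAIT for WSASend, and EAGAIN/EWOULDBLOCK for WSAEWOULDBLOCK. The names wait_for_writable(), send_with_retry() and WRITE_WAIT_TIMEOUT_MS are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical constant mirroring the 100 ms timeout described above. */
#define WRITE_WAIT_TIMEOUT_MS 100

/*
 * Wait until the socket looks writable, but never longer than the timeout.
 * Returns 1 if poll() reported an event, 0 otherwise. On timeout errno is
 * set to EINTR, so the caller sees an interrupted wait instead of a
 * process that sleeps forever.
 */
int
wait_for_writable(int sock)
{
    struct pollfd pfd;
    int           rc;

    pfd.fd = sock;
    pfd.events = POLLOUT;
    pfd.revents = 0;

    rc = poll(&pfd, 1, WRITE_WAIT_TIMEOUT_MS);
    if (rc > 0)
        return 1;           /* writable, or an error the next send() reports */
    if (rc == 0)
        errno = EINTR;      /* timeout is reported as an interrupted wait */
    return 0;               /* rc < 0: errno was already set by poll() */
}

/*
 * Send one datagram, retrying while the socket is busy. A successful wait
 * is only a hint: the socket can become busy again before the next send(),
 * so the send is retried in a loop rather than assumed to succeed.
 */
ssize_t
send_with_retry(int sock, const void *buf, size_t len)
{
    assert(buf != NULL);

    for (;;)
    {
        ssize_t n = send(sock, buf, len, MSG_DONTWAIT);

        if (n >= 0)
            return n;       /* datagram handed to the kernel */
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;      /* real error, errno set by send() */

        /* Socket busy: bounded wait, then go around and try again. */
        if (!wait_for_writable(sock))
            return -1;      /* timeout (EINTR) or poll() failure */
    }
}
```

In this sketch a timeout is passed back to the caller as -1 with errno set to EINTR, so the caller decides whether to try again; retrying inside the loop instead would be an equally plausible choice.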

Note: the situations above occur only under very high load and very rarely, about once every several hours. Personally, I don't like the approach in 1), but I can't find a better solution.

To simulate the bug, I developed a test suite (http://www.sigaev.ru/misc/wintest.tgz). The test runs one 'collector' and several clients (32 by default), which send a lot of packets to the collector. The socket library is taken directly from pgsql. Installation & testing (under MinGW):

% tar xzvf wintest.tgz
% cd wintest
% make
% ./serveres

The archive contains two versions of socket.c:
socket.c.orig - as it is in pgsql
socket.c      - already patched

fprintf() calls are added to pgwin32_waitforsinglesocket(), and with socket.c.orig several clients never come out of the wait. Usually it takes 1-3 minutes to reproduce. The test suite works the sockets harder than pgsql does, so the block occurs even on a uniprocessor box. It may be necessary to increase the number of clients to reproduce the bug reliably.

Objections, comments, advice, suggestions? I intend to commit the patch to all affected branches today or tomorrow if there are no objections or better ideas.

-- Teodor Sigaev E-mail: [EMAIL PROTECTED] WWW: http://www.sigaev.ru/