> Hi all, > > Attached are two patches which in combination make > pg_stat_activity work reliably for us on Windows. > > The mysterious socket error turned out to be WSAEWOULDBLOCK. > Per > http://msdn.microsoft.com/library/default.asp?url=/library/en- > us/winsock/winsock/windows_sockets_error_codes_2.asp > , it seems the thing to do is loop and try again. pipe.patch > does that.
Good catch. I always knew there was something afoot in that code, but never thought it was that simple ;-) I kinda got snowed in on the idea that it had to do with inheritance :-( Also, might we want a CHECK_FOR_INTERRUPTS in the loop? Since it's a potential stuck-forever place. Oh, and checking the code go pgwin32_recv, I think I see where this is coming from: pgwin32_recv calls pgwin32_waitforsinglesocket(). If that one succeeds *and a signal is delivered while it's waiting*, we'll get out og pgwin32_waitforsinglesocket() without clearing the WSA code. The rest of the pg code uses errno which is properly set to EINTR, but the pipe code uses WSAGetLastError() directly. The fix for that is probably to add a WSASetLastError(WSAEINTR) right after the errno=EINTR in pgwin32_waitforsinglesocket(). Does this seem right? Can you try by adding this, and then backing out your pgstats patch, and see if this alone is enough? BTW, what's with the change to all the error msgs? And finally, the error handling looks a bit off? We specifically *don't* want it to log an error for the WSAECONNRESET state - it's a normal state. Or am I reading the patch wrong? > The one remaining problem is that there seems to be a race > condition when installing the temporary stats file on > Windows. As we were monitoring pg_stat_activity during the > test run, occasionally we'd get a response with zero rows. > This may not be much of a problem during normal conditions > (the server was completely overloaded and we were banging > away with "Up Arrow", "Enter" watching pg_stat_activity). > > What's the best way to do an atomic rename on Windows? > Alternatively, would it make sense to sleep and try again (up > to some limit) when trying to open the stats file on Windows? rename() should be atomic. We do a special version on win32 (see http://developer.postgresql.org/cvsweb.cgi/pgsql/src/port/dirmod.c?rev=1 .42), but I don't see how that would change the atomicity of it. Do you get anything at all in the logs when this happens? Are you sure the reason is that it picks up an empty file, or could it be something else? //Magnus ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match