On Thu, November 16, 2006 00:00, Leandro Lucarella wrote:

>> otherwise you shouldn't see SIGPIPE at all.  The way it normally works is
>> this:
>>
>> 1. Your backend goes down, dropping its end of the connecting socket.
>>
>> 2. The C API, libpq, gets an error return code on the next attempt to use
>> the socket and handles it by noting that the connection has died.
>
> No, the C API, libpq, does not use MSG_NOPIPE when send()ing and
> recv()ing (I've checked the source code), so when the other end of the
> connection goes down, a SIGPIPE signal is sent to the process.

But AFAICS the send/recv operation should *also*, after the signal has been
handled, fail and return an errno that describes the situation.  What libpq
does is just check for a negative return value and read errno.

>> I haven't tried killing the backend while a libpq/libpqxx client was
>> locally connected, so I haven't run across the SIGPIPE.
>
> I insist this has nothing to do with running locally or remotely (at
> least if "backend" is what I think it is, the postgresql server, but
> maybe I'm wrong, I'm new to postgresql).

Yes, "backend" is the server process.

It's been over a year since I last looked into this particular bit of error
handling in libpq, so I'm a bit fuzzy on the details.  It's the E* error
codes, not the SIG* signals that matter here--and IIRC there are separate
ones for broken Unix-domain connections and broken TCP connections.  If that
is the case, the signal may be the same for both cases even if the errno
codes are different.

Shock horror update: it looks like the fix for the libpq bug I mentioned did
not make it into CVS somehow!  Check pqReadData() and pqSendSome() here:

http://developer.postgresql.org/cvsweb.cgi/pgsql/src/interfaces/libpq/fe-misc.c?rev=1.130;content-type=text%2Fx-cvsweb-markup

The default way of handling errors there, apart from a series of known error
codes, is still to issue an error message but leave the connection in
"CONNECTION_OK" state.  Unless it's been handled elsewhere, that could hide
some types of connection error and possibly make libpqxx and libpq itself go
on trying for longer than necessary.

See first discussion here (I can't find a followup discussion I do remember
taking place):

http://www.nabble.com/libpq-and-connection-failures-tf123204.html#a339088

> Yes, that's what I plan to do, but I wanted to check if is there any
> more elegant solution, to tell libpqxx to tell libpq to use MSG_NOPIPE
> =) and/or to check if this is a known issue and to collect others
> experience.

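For what it's worth, here is a minimal sketch of the "ignore the signal and
look at errno" approach.  It is not taken from libpq.  The helper names
ignore_sigpipe() and errno_after_peer_close() are made up for illustration,
and the "backend" is simulated with a local Unix-domain socket pair.  On a
TCP connection the error code may differ (ECONNRESET, for instance), so treat
EPIPE here as what this particular setup reports rather than a general rule.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Ignore SIGPIPE process-wide, so that writing to a dead connection
 * fails with an error code instead of terminating the process.
 * Returns 0 on success, -1 on failure. */
int ignore_sigpipe(void)
{
    return (signal(SIGPIPE, SIG_IGN) == SIG_ERR) ? -1 : 0;
}

/* Hypothetical demo helper: simulate a backend going away by closing
 * one end of a local socket pair, then write to the surviving end.
 * Returns the errno left by send(), 0 if send() unexpectedly
 * succeeded, or -1 if the socket pair could not be created.
 * Call ignore_sigpipe() first, or the send() will raise SIGPIPE. */
int errno_after_peer_close(void)
{
    int fds[2];
    ssize_t n;
    int err;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;

    close(fds[1]);                    /* the "backend" drops its end */

    n = send(fds[0], "x", 1, 0);      /* no special flags, as in the thread */
    err = (n < 0) ? errno : 0;

    close(fds[0]);
    return err;
}
```

With the signal ignored, the negative return value and errno are all the
caller has to go on, which is the check described above.
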
Possibly the core team felt that applications might want to see the signals
for themselves.  You may have to delve into the main postgres mailing lists.
I did some digging myself, and I fear there may be a lot more coming.  :-/

>> IIRC the bug was fixed in updates of all supported major versions around
>> the time 8.1 came out.
>
> I'm using postgresql 8.1 and libpqxx 2.6.8. I can discard this
> possibility?

Now that I see that the fix is not in CVS as I thought, no.  :-(

>> Three recommendations: set SIG_PIPE to SIG_IGN; ensure your libpq is up to
>> date; and if you still have the slow timeouts after that, mess with your
>> networking stack (very carefully of course) to make it give up faster.
>
> So the keep-alive solution is discarded? I don't like to mess arround
> with the TCP general configuration because postgres is not the only
> service in the machine and I other services don't need so short timeouts.

I could try to build some form of keepalive support, but I don't have much
time to work on it at the moment.  There does seem to be some keepalive
mechanism in libpq; I guess that using it would also require help from the
application.


Jeroen
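
On the keepalive question, a rough sketch of what "help from the application"
could look like: enabling keepalive on a single socket, which leaves the
machine-wide TCP settings alone.  This is an assumption about how one might do
it, not something libpq or libpqxx is known to do here.  The function names
are invented for the example, and the TCP_KEEP* options are Linux-style names
that other platforms may lack, hence the #if guard.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Enable TCP keepalive probes on one socket only.
 * idle_secs:     idle time before the first probe
 * interval_secs: time between probes
 * probe_count:   unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on failure (errno set by setsockopt). */
int enable_keepalive(int fd, int idle_secs, int interval_secs, int probe_count)
{
    int on = 1;

    if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) != 0)
        return -1;

#if defined(TCP_KEEPIDLE) && defined(TCP_KEEPINTVL) && defined(TCP_KEEPCNT)
    /* Per-socket tuning where the platform offers it. */
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof idle_secs) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &interval_secs, sizeof interval_secs) != 0)
        return -1;
    if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT,
                   &probe_count, sizeof probe_count) != 0)
        return -1;
#else
    /* No per-socket tuning available: system defaults apply. */
    (void)idle_secs;
    (void)interval_secs;
    (void)probe_count;
#endif
    return 0;
}

/* Report whether SO_KEEPALIVE is set: 1 = on, 0 = off, -1 = error. */
int keepalive_enabled(int fd)
{
    int val = 0;
    socklen_t len = sizeof val;

    if (getsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &val, &len) != 0)
        return -1;
    return val != 0;
}
```

In a libpq program the descriptor would come from PQsocket(conn), so the call
would be along the lines of enable_keepalive(PQsocket(conn), 60, 10, 5), with
the numbers purely illustrative.  It only makes sense for TCP connections, and
it would have to be repeated after every reconnect.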
