> -----Original Message-----
> From: Norbert Sendetzky [mailto:norb...@linuxnetworks.de]
> Sent: Tuesday, January 25, 2011 2:30 PM
> To: OpenDBX devel list
> Subject: Re: [opendbx] FW: [BUGS] BUG #5837: PQstatus() fails to report lost connection
>
> > This will require some changes to lib/backends/pgsql_basic.c and/or
> > the OpenDBX documentation, I'm afraid.
>
> Could you explain to me in detail where the changes must happen? Then I
> will see to make an update soon.

The background: a connection is established, and some queries are run and return successfully. Then PostgreSQL is deliberately restarted.

Here's what you're doing. A call to odbx_query() after the restart returns normally (which is strange by itself...). In odbx_result(), you:

- call PQgetResult(), which returns non-NULL
- call PQresultStatus(), which reports PGRES_FATAL_ERROR
- call PQstatus(), which returns CONNECTION_OK
- return -1

Then I call odbx_error_type(), which sees that the handle got PGRES_FATAL_ERROR but also CONNECTION_OK, so it just returns 1 (no reconnect required).

PQstatus() appears to give a false result because the TCP part of the connection was fine (there was no I/O error, and EOF hasn't been reached). Interestingly enough, if you ask for the error string matching the fatal error at this point, it does tell you that the connection has been reset by administrator action.

The problem now is that, since no reconnect is attempted, all future queries fail on that handle.

The underlying issue is that PQstatus() relies on internal state that has not been updated to reflect that the connection is dead. According to the libpq people, that's because you didn't repeat PQgetResult() until it returned NULL. I guess that on an administrative restart of the server, all connections are notified, so there's I/O pending for read on the socket at the client. odbx_result() causes this message to be read, but EOF isn't reached yet, so PQstatus() continues to show the connection as usable.

So apparently you have to call PQgetResult() again anyway, even though PQresultStatus() has already indicated a fatal error. I agree that this seems strange; I likened it to select() returning an indication that some descriptor is read-ready but also has an exception, and then being told that the user needs to call read() repeatedly until EOF just to get errno to tell you what happened.
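
For what it's worth, here is a rough sketch of the "drain first, then ask" pattern the libpq people describe: keep fetching results until NULL comes back, cache them, and only then sample the connection status. This is not OpenDBX or libpq code. next_result_fn and conn_ok_fn are hypothetical stand-ins for libpq's PQgetResult() and PQstatus(), and fake_conn merely imitates the restart scenario (one fatal result, then NULL, and only then does the status flip). All names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CACHED 16

/* Hypothetical stand-ins for PQgetResult() and PQstatus(). */
typedef void *(*next_result_fn)(void *conn); /* NULL once drained */
typedef int (*conn_ok_fn)(void *conn);       /* nonzero if usable */

struct result_cache {
    void  *results[MAX_CACHED];
    size_t count;   /* number of cached results */
    size_t next;    /* index of the next result to hand out */
    int    conn_ok; /* status sampled after the drain */
};

/* Read every pending result, then sample the connection status.
 * Returns 0 on success, -1 if the fixed-size cache overflows
 * (a real backend would grow the array instead). */
static int cache_fill(struct result_cache *c, next_result_fn next,
                      conn_ok_fn ok, void *conn)
{
    void *r;

    assert(c != NULL && next != NULL && ok != NULL);
    c->count = 0;
    c->next = 0;
    c->conn_ok = 0; /* assume unusable until the drain completes */

    while ((r = next(conn)) != NULL) {
        if (c->count == MAX_CACHED)
            return -1;
        c->results[c->count++] = r;
    }
    c->conn_ok = ok(conn); /* only now is the status trustworthy */
    return 0;
}

/* Hand out cached results one at a time; NULL when exhausted. */
static void *cache_pop(struct result_cache *c)
{
    return (c->next < c->count) ? c->results[c->next++] : NULL;
}

/* Fake connection imitating the server restart described above. */
struct fake_conn { int reads; int eof_seen; };
static int fatal_marker; /* its address plays the fatal-error result */

static void *fake_next(void *p)
{
    struct fake_conn *fc = p;
    if (fc->reads++ == 0)
        return &fatal_marker; /* first read: the fatal error */
    fc->eof_seen = 1;         /* second read: EOF is noticed */
    return NULL;
}

static int fake_ok(void *p)
{
    return !((struct fake_conn *)p)->eof_seen;
}
```

With the fake connection, the status reads as usable before the drain and as dead afterwards, which is the behaviour described above. In a real backend the callbacks would wrap the libpq calls, and the cached entries would be PGresult pointers that eventually need PQclear(). I haven't tried this against pgsql_basic.c.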

So unless they have a change of heart and actually fix this, it sounds like you're going to have to call PQgetResult() repeatedly on the first call to odbx_result(), caching all the possible results, and then hand them out one at a time when the user calls it again. That's the only way PQstatus() will tell you the truth on a fatal error.

Or, you could "pass the buck" and require your users to do the same thing libpq requires, namely keep calling odbx_result() until the end of the result set is reached, so that you get the "true" PQstatus() value and the user can then call odbx_error_type() with correct results.

Neither solution is especially pretty.

> Thanks for your help

My pleasure. Sorry for the bad news. :-)

-MSK

_______________________________________________
libopendbx-devel mailing list
libopendbx-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libopendbx-devel
http://www.linuxnetworks.de/doc/index.php/OpenDBX