[soci-devel] Failover mechanism

Vadim Zeitlin Thu, 05 Mar 2020 06:37:14 -0800

 Hello,

 I'd like to understand how the failover mechanism, added back in
3daef94c7aa463be21b2d428eae2b5b21f10a6aa, is supposed to work. My goal is to
generalize its PostgreSQL implementation to all (non-Oracle) databases, as
the code in src/backends/postgresql/error.cpp is not really
PostgreSQL-specific at all, but there are a couple of problems with this.


 The first one is https://github.com/SOCI/soci/issues/793, i.e. the
current implementation in the PostgreSQL backend doesn't use the correct
connection string when trying to reconnect. This is not a huge problem on
its own, and I've already fixed it locally, but it made the mechanism
impossible to test, which makes me doubt whether it was actually tested
at all with this database.

 The second problem is that, for me, failover means transparently continuing
to run even if the database becomes (temporarily) inaccessible. In
particular, I thought the failover mechanism would retry the SQL statement
which resulted in an error. However, this isn't the case currently in the
PostgreSQL backend at all. So I wonder if it's "just" another bug in this
code (which seems relatively plausible, considering the absence of testing
mentioned above) or if my assumption is wrong and this is not how it's
supposed to work at all?

 This is partially related to the third problem/question: does anybody know
how to test this mechanism with Oracle, which is the only backend with
native support for this functionality? I naively thought that just
breaking the connection to the server would be enough to trigger it, but
what actually happens is that attempting to execute any SQL statement while
the connection is unavailable (I just added an iptables rule on the local
router rejecting packets to the remote server to emulate this) simply hangs,
apparently forever -- or at least beyond the time I was prepared to wait
(several minutes). Does anybody know how this is supposed to work and what
kind of configuration I need (either client or server-side) to test
failover with Oracle?


 Finally, I wonder if it's really even worth spending time on this or if it
would, perhaps, be better to just provide a ping() method checking if the
connection is still alive and let the application use it when handling SOCI
exceptions to try to reconnect on its own, if necessary? This would surely
be much simpler, and the only advantage of the current failover mechanism I
see is that it's natively supported by Oracle. But without knowing how it
works there (and, in particular, whether the statements are still run
if a failover happens during their execution), I don't know if it's really
important. I.e. if Oracle doesn't run the statement to completion either
and it's up to SOCI or the application to do it anyhow, I don't see much
point in using this mechanism instead of simply reconnecting when the
connection is lost.
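
 To make the simpler alternative concrete, here is a rough sketch of what
the application-side handling could look like. It is only an illustration
under stated assumptions: FakeSession, its ping() and reconnect() members
and run_with_reconnect() are hypothetical names invented for this example
and are not part of SOCI's API.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>

// Hypothetical stand-in for a database session (NOT a SOCI class).
struct FakeSession
{
    bool alive = true;   // whether the simulated connection is up
    int reconnects = 0;  // how many times reconnect() was called

    bool ping() const { return alive; }
    void reconnect() { alive = true; ++reconnects; }
};

// Runs op(s). If it throws and ping() reports a dead connection, reconnects
// and runs op again, at most max_retries times. Errors raised while the
// connection is still alive are rethrown unchanged, as they are presumably
// not connection-related.
template <typename Session, typename Op>
auto run_with_reconnect(Session& s, Op op, int max_retries = 1)
    -> decltype(op(s))
{
    assert(max_retries >= 0);

    for (int attempt = 0; ; ++attempt)
    {
        try
        {
            return op(s);
        }
        catch (std::exception const&)
        {
            if (attempt >= max_retries || s.ping())
                throw; // out of retries, or the connection is fine

            s.reconnect();
        }
    }
}
```

 The caller would wrap each unit of work in run_with_reconnect(). Note that
blindly re-running a statement is only safe if it is idempotent or known not
to have been applied before the connection was lost, which is exactly the
question about statement completion raised above.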


 Thanks in advance for any help/hints!
VZ


