There are actually two issues sitting here:

1. Transaction Recovery and

2. Knowing when you've lost a connection and when to restart it.

I am not familiar with PostgreSQL, but with a transaction-oriented database such as Firebird or Oracle, losing a connection and then reconnecting does not imply recovering the transaction. The transaction may be left in limbo and require a manual rollback or commit, depending on which outcome is the most desirable, and until you do this the database may not be in a consistent state - depending on the application. It gets even more complicated with transactions spanning multiple databases, where two-phase commit issues appear.
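As a concrete illustration, here is a minimal FPC/SQLdb sketch of the point (the connection settings are invented, and the except branch is a placeholder, not a complete recovery policy): after reconnecting you are in a new transaction, and the fate of the old one still has to be resolved.

program reconnectdemo;

{$mode objfpc}{$H+}

uses
  SysUtils, db, sqldb, pqconnection;

var
  Conn: TPQConnection;
  Trans: TSQLTransaction;
begin
  Conn := TPQConnection.Create(nil);
  Trans := TSQLTransaction.Create(nil);
  try
    Conn.HostName     := 'dbhost';    { illustrative values }
    Conn.DatabaseName := 'appdb';
    Conn.UserName     := 'app';
    Conn.Password     := 'secret';
    Conn.Transaction  := Trans;
    Conn.Open;
    Trans.StartTransaction;
    try
      Conn.ExecuteDirect('UPDATE accounts SET balance = balance - 100 WHERE id = 1');
      Trans.Commit;
    except
      on E: EDatabaseError do
      begin
        { The link may be gone. Reopening gives us a NEW transaction;
          whether the UPDATE took effect must be verified (or the work
          redone) before carrying on. }
        Conn.Close(True);   { ForceClose: don't wait on a dead server }
        Conn.Open;
        Trans.StartTransaction;
        { ...check state here, then redo or skip the update... }
      end;
    end;
  finally
    Trans.Free;
    Conn.Free;
  end;
end.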

It is also not always obvious when you have lost a connection. TCP depends on retransmission and inactivity timers to detect connection loss, and some implementations don't detect loss during periods of inactivity at all, only noticing it when no reply arrives after several retries. In short, a long time can pass between the connection being lost and either the server or the client noticing - and those two events may themselves be well separated in time. Indeed, the user may already be on the line to the help desk complaining that their computer is no longer responding, long before the lost-connection error message gets displayed.
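One practical mitigation is not to rely on TCP alone but to have the application probe the link itself. The sketch below (the function name and the 'SELECT 1' probe are my own illustration, again using SQLdb and assuming uses SysUtils, db, sqldb) notices a dead connection within one polling interval rather than whenever TCP finally gives up. Tuning TCP keepalives, where the platform allows it, is a complementary measure.

function ConnectionAlive(Conn: TSQLConnection; Trans: TSQLTransaction): Boolean;
var
  Q: TSQLQuery;
begin
  Result := False;
  if not Conn.Connected then
    Exit;
  Q := TSQLQuery.Create(nil);
  try
    Q.Database := Conn;
    Q.Transaction := Trans;
    Q.SQL.Text := 'SELECT 1';   { costs next to nothing server-side }
    try
      Q.Open;
      Q.Close;
      Result := True;
    except
      on EDatabaseError do
        Result := False;        { no reply: treat the link as dead }
    end;
  finally
    Q.Free;
  end;
end;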

The bottom line is that neither detecting connection loss nor recovering from it is a simple matter. In any serious database application, you need to think about how quickly you need to detect a lost connection and, once it is detected, what the recovery strategy is. Will it require database administrator action to rollback or commit outstanding transactions? Is it appropriate to always roll back limbo transactions, or do you need to decide the appropriate recovery on a case-by-case basis?

There is no "one size fits all" answer to the problem. The ideal is that there are no lost connections, except in extreme circumstances such as hardware failure. Automatic updates may seem a good idea, but sometimes it's better to plan and schedule upgrades during planned outages rather than letting them happen when you least want them.


On 11/11/17 18:41, Mark Morgan Lloyd wrote:
Graeme started a short thread on databases in "The Other Place" a few days ago, but I thought this might be of sufficient general relevance to raise here.

I had a system outage this morning, with all apps suddenly losing their connectivity to the PostgreSQL server.

It turned out that the cause was that Debian had done an unattended upgrade of the Postgres server, and in restarting it had killed all persistent connections. There is no "Can we kill your database when we feel like it?" question during Debian installation.

I anticipate that the same problem will affect other databases or software to which a client program maintains a persistent session, unless explicit steps are taken to recognise and recover from a server-side restart.

Noting that the traditional way of using the data-aware controls introduced by Delphi etc. is particularly vulnerable, and noting that the FPC/Lazarus controls do a good job of presenting a common API irrespective of what backend server is being used, would it be feasible to have a "reconnect monitor" or similar to help recover from this sort of thing?


