Oliver Jowett <[EMAIL PROTECTED]> writes:
> The scenario I need to deal with is this:

> There are multiple nodes, network-separated, participating in a cluster.
> One node is selected to talk to a particular postgresql instance (call
> this node A).

> A starts a transaction and grabs some locks in the course of that
> transaction. Then A falls off the network before committing because of a
> hardware or network failure. A's connection might be completely idle
> when this happens.

> The cluster liveness machinery notices that A is dead and selects a new
> node to talk to postgresql (call this node B). B resumes the work that A
> was doing prior to failure.

> B has to wait for any locks held by A to be released before it can make
> any progress.

> Without some sort of tunable timeout, it could take a very long time (2+
> hours by default on Linux) before A's connection finally times out and
> releases the locks.

Wouldn't it be reasonable to expect the "cluster liveness machinery" to
notify the database server's kernel that connections to A are now dead?
I find it really unconvincing to suppose that the above problem should
be solved at the database level.

			regards, tom lane
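
For reference, the "2+ hours by default on Linux" figure quoted above is
consistent with the kernel's default TCP keepalive settings (7200 seconds
idle, then 9 probes 75 seconds apart). The sketch below is only an
illustration of that arithmetic and of per-socket keepalive tuning; it is
not taken from the thread or from PostgreSQL. The helper names
keepalive_detection_seconds and enable_keepalive, and the values 60/10/3,
are assumptions chosen for the example.

```python
import socket


def keepalive_detection_seconds(idle, interval, probes):
    """Approximate worst-case time before an idle, dead peer is detected.

    The kernel waits `idle` seconds with no traffic, then sends `probes`
    keepalive probes spaced `interval` seconds apart before giving up.
    """
    return idle + interval * probes


def enable_keepalive(sock, idle=60, interval=10, probes=3):
    """Turn on TCP keepalive for one socket with shorter timers.

    The defaults here are illustrative assumptions, not recommendations.
    """
    # Portable switch: ask the kernel to probe idle connections at all.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The per-socket timer options below are platform-specific (present on
    # Linux), so each one is guarded and skipped where it does not exist.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


if __name__ == "__main__":
    # Linux defaults: 7200 s idle, 75 s between probes, 9 probes.
    print(keepalive_detection_seconds(7200, 75, 9))

    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    s.close()
```

With the Linux defaults the detection time comes to 7875 seconds, a little
over two hours. Whether such timers belong on the database side at all is
exactly the point under discussion in this thread.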