Robert Haas <robertmh...@gmail.com> wrote:

> What does bother me is the fact that we are engineering a critical
> aspect of our system reliability around vendor-specific
> implementation details of the TCP stack, and that if any version
> of any operating system that we support (or ever wish to support
> in the future) fails to have a reliable implementation of this
> feature AND configurable knobs that we can tune to suit our needs,
> then we're screwed. Does anyone want to argue that this is NOT a
> house of cards?

[/me raises hand]

TCP keepalive has been available and a useful part of my reliability
solutions since I had to find a way to clean up zombie database
connections caused by clients powering down their workstations
without closing their apps -- that was in OS/2 circa 1990. I'm
pretty sure I've also used it on HP-UX, whatever Unix flavor was on
our Sun SPARC servers, several versions of Windows, and several
versions of Linux. As far as I can recall, the default was always
two hours of idle time before doing anything, followed by nine small
packets sent over the course of ten minutes before giving up (if
none were answered). I'm not sure whether the timings were
controllable from within the applications, because we generally
changed the OS defaults. Even so, recovery after two hours and ten
minutes is way better than waiting for eternity.

As someone else said, we may want to add some sort of
keepalive-style ping to our application's home-grown protocol; but I
don't see that as an argument to suppress a very widely supported
standard protocol. These address slightly different problem sets;
let's solve the one that came up in testing for the vast majority of
runtime environments by turning on TCP keepalives.

No, I don't see it as a house of cards.

-Kevin
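
For illustration only, here is a minimal Python sketch of turning on keepalives for one socket and estimating how long a dead peer could go unnoticed. The function names are hypothetical, and the 75-second probe interval is an assumption (it is the usual Linux default, and nine probes at that spacing come to roughly the ten minutes mentioned above). The per-socket timing options are platform-specific, so the sketch sets them only where the socket module exposes them and otherwise leaves the OS defaults in place.

```python
import socket


def enable_keepalive(sock, idle=7200, interval=75, count=9):
    """Turn on TCP keepalive; tune timings only where the platform allows."""
    # SO_KEEPALIVE itself is widely available.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket timing knobs are not portable; these names exist on Linux.
    # Where they are missing or rejected, the OS-wide defaults still apply.
    for name, value in (("TCP_KEEPIDLE", idle),
                        ("TCP_KEEPINTVL", interval),
                        ("TCP_KEEPCNT", count)):
        opt = getattr(socket, name, None)
        if opt is None:
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
        except OSError:
            pass  # option known to Python but refused by this OS


def worst_case_detection_seconds(idle=7200, interval=75, count=9):
    """Idle time before the first probe, plus every unanswered probe."""
    return idle + interval * count
```

With the assumed defaults, worst_case_detection_seconds() comes to 7875 seconds, a little over two hours and eleven minutes.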
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers