Where are we on this?  I do think it solves a problem that some people
are having, and it seems it would detect a disconnected client and abort
a long-running query.

I am not very excited about adding four more GUC variables.  I am
thinking we could just use the OS defaults for now and see if we need
more later; that would add only one GUC.

It compiles/tests fine on my BSD system. 
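
For reference, the one-GUC variant would amount to little more than the
following on each accepted TCP socket.  This is only a minimal standalone
sketch, not code from the patch: enable_keepalive() and keepalive_enabled()
are hypothetical helper names, and idle time, probe interval and probe
count are left entirely to the OS.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>

/*
 * Hypothetical helper: switch SO_KEEPALIVE on or off for a socket and
 * leave all keepalive timing at the OS defaults.
 * Returns 0 on success, -1 on failure.
 */
static int
enable_keepalive(int sock, int onoff)
{
	int			on = onoff ? 1 : 0;

	if (setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE,
				   (char *) &on, sizeof(on)) < 0)
	{
		perror("setsockopt(SO_KEEPALIVE)");
		return -1;
	}
	return 0;
}

/* Report whether SO_KEEPALIVE is set: 1 or 0, or -1 on error. */
static int
keepalive_enabled(int sock)
{
	int			on = 0;
	socklen_t	size = sizeof(on);	/* value-result argument */

	if (getsockopt(sock, SOL_SOCKET, SO_KEEPALIVE,
				   (char *) &on, &size) < 0)
		return -1;
	return on != 0;
}
```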

---------------------------------------------------------------------------

Oliver Jowett wrote:
> Tom Lane wrote:
> > Oliver Jowett <[EMAIL PROTECTED]> writes:
> > 
> >>Tom Lane wrote:
> >>
> >>>I'm not convinced that Postgres ought to provide
> >>>a way to second-guess the TCP stack ...
> > 
> > 
> >>Would you be ok with a patch that allowed configuration of the 
> >>TCP_KEEPCNT / TCP_KEEPIDLE / TCP_KEEPINTVL socket options on backend 
> >>sockets?
> > 
> > 
> > [ shrug... ]  As long as it doesn't fail to build on platforms that
> > don't offer those options, I couldn't complain too hard.  But do we
> > really need all that?
> 
> Here's a patch that adds four new GUCs:
> 
>   tcp_keepalives (defaults to on, controls SO_KEEPALIVE)
>   tcp_keepalives_idle (controls TCP_KEEPIDLE)
>   tcp_keepalives_interval (controls TCP_KEEPINTVL)
>   tcp_keepalives_count (controls TCP_KEEPCNT)
> 
> They're ignored for AF_UNIX connections and when running standalone.
> 
> tcp_keepalives_* treat 0 as "use system default". If the underlying OS
> doesn't provide the TCP_KEEP* socket options, the GUCs are present but
> reject any value other than 0.
> 
> SHOW reflects the currently-in-use value or 0 if not applicable or not
> known. i.e. if you set it to 0 and you have the socket options
> available, SHOW will show the result of getsockopt() which is non-zero.
> 
> A default install on my Linux system produces:
> 
> template1=# show all;
>               name              |                   setting
> --------------------------------+----------------------------------------------
> [...]
>  tcp_keepalives                 | on
>  tcp_keepalives_count           | 9
>  tcp_keepalives_idle            | 7200
>  tcp_keepalives_interval        | 75
> [...]
> 
> I haven't had a chance to check it builds on other systems or to test
> that this handles actual network failures nicely yet.
> 
> -O
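
To put the Linux defaults shown above in perspective: a dead peer is only
noticed after the idle time has passed and then the configured number of
probes, spaced one interval apart, have all gone unanswered.  A rough
sketch of that arithmetic follows; keepalive_detect_seconds() is a
hypothetical name, and the exact timing is up to the OS TCP stack, so
treat the result as an approximation.

```c
/*
 * Hypothetical helper: approximate time, in seconds, before the kernel
 * gives up on an idle connection whose peer has vanished.
 *
 * idle     - seconds of inactivity before the first probe (TCP_KEEPIDLE)
 * interval - seconds between unanswered probes (TCP_KEEPINTVL)
 * count    - unanswered probes tolerated (TCP_KEEPCNT)
 *
 * With the values above: 7200 + 75 * 9 = 7875 seconds, a bit over
 * two hours and eleven minutes.
 */
static long
keepalive_detect_seconds(int idle, int interval, int count)
{
	return (long) idle + (long) interval * (long) count;
}
```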

> Index: doc/src/sgml/runtime.sgml
> ===================================================================
> RCS file: /projects/cvsroot/pgsql/doc/src/sgml/runtime.sgml,v
> retrieving revision 1.315
> diff -c -r1.315 runtime.sgml
> *** doc/src/sgml/runtime.sgml 23 Apr 2005 03:27:40 -0000      1.315
> --- doc/src/sgml/runtime.sgml 3 May 2005 01:44:02 -0000
> ***************
> *** 889,894 ****
> --- 889,961 ----
>         </listitem>
>        </varlistentry>
>        
> +      <varlistentry id="guc-tcp-keepalives" xreflabel="tcp_keepalives">
> +       <term><varname>tcp_keepalives</varname> (<type>boolean</type>)</term>
> +       <indexterm>
> +        <primary><varname>tcp_keepalives</> configuration parameter</primary>
> +       </indexterm>
> +       <listitem>
> +        <para>
> +         Controls the use of TCP keepalives on client connections. When enabled,
> +         idle connections will be periodically probed to check that the client
> +         is still present. If sufficient probes are lost, the connection will
> +         be broken. This option is ignored for connections made via a
> +         Unix-domain socket.
> +        </para>
> +       </listitem>
> +      </varlistentry>
> +      
> +      <varlistentry id="guc-tcp-keepalives-idle" xreflabel="tcp_keepalives_idle">
> +       <term><varname>tcp_keepalives_idle</varname> (<type>integer</type>)</term>
> +       <indexterm>
> +        <primary><varname>tcp_keepalives_idle</> configuration parameter</primary>
> +       </indexterm>
> +       <listitem>
> +        <para>
> +         On systems that support the TCP_KEEPIDLE socket option, specifies the
> +         number of seconds between sending keepalives on an otherwise idle
> +         connection. A value of 0 uses the system default. If TCP_KEEPIDLE is
> +         not supported, this parameter must be 0. This option is ignored for
> +         connections made via a Unix-domain socket, and will have no effect
> +         unless the <varname>tcp_keepalives</varname> option is enabled.
> +        </para>
> +       </listitem>
> +      </varlistentry>
> +      
> +      <varlistentry id="guc-tcp-keepalives-interval" xreflabel="tcp_keepalives_interval">
> +       <term><varname>tcp_keepalives_interval</varname> (<type>integer</type>)</term>
> +       <indexterm>
> +        <primary><varname>tcp_keepalives_interval</> configuration parameter</primary>
> +       </indexterm>
> +       <listitem>
> +        <para>
> +         On systems that support the TCP_KEEPINTVL socket option, specifies how
> +         long, in seconds, to wait for a response to a keepalive before
> +         retransmitting. A value of 0 uses the system default. If TCP_KEEPINTVL
> +         is not supported, this parameter must be 0. This option is ignored
> +         for connections made via a Unix-domain socket, and will have no effect
> +         unless the <varname>tcp_keepalives</varname> option is enabled.
> +        </para>
> +       </listitem>
> +      </varlistentry>
> +      
> +      <varlistentry id="guc-tcp-keepalives-count" xreflabel="tcp_keepalives_count">
> +       <term><varname>tcp_keepalives_count</varname> (<type>integer</type>)</term>
> +       <indexterm>
> +        <primary><varname>tcp_keepalives_count</> configuration parameter</primary>
> +       </indexterm>
> +       <listitem>
> +        <para>
> +         On systems that support the TCP_KEEPCNT socket option, specifies how
> +         many keepalives may be lost before the connection is considered dead.
> +         A value of 0 uses the system default. If TCP_KEEPCNT is not
> +         supported, this parameter must be 0. This option is ignored for
> +         connections made via a Unix-domain socket, and will have no effect
> +         unless the <varname>tcp_keepalives</varname> option is enabled.
> +        </para>
> +       </listitem>
> +      </varlistentry>
> +      
>        </variablelist>
>        </sect3>
>        <sect3 id="runtime-config-connection-security">
> Index: src/backend/libpq/pqcomm.c
> ===================================================================
> RCS file: /projects/cvsroot/pgsql/src/backend/libpq/pqcomm.c,v
> retrieving revision 1.176
> diff -c -r1.176 pqcomm.c
> *** src/backend/libpq/pqcomm.c        22 Feb 2005 04:35:57 -0000      1.176
> --- src/backend/libpq/pqcomm.c        3 May 2005 01:44:03 -0000
> ***************
> *** 87,93 ****
>   #include "libpq/libpq.h"
>   #include "miscadmin.h"
>   #include "storage/ipc.h"
> ! 
>   
>   /*
>    * Configuration options
> --- 87,93 ----
>   #include "libpq/libpq.h"
>   #include "miscadmin.h"
>   #include "storage/ipc.h"
> ! #include "utils/guc.h"
>   
>   /*
>    * Configuration options
> ***************
> *** 577,582 ****
> --- 577,583 ----
>       if (!IS_AF_UNIX(port->laddr.addr.ss_family))
>       {
>               int                     on;
> +             socklen_t   size = sizeof(int);
>   
>   #ifdef      TCP_NODELAY
>               on = 1;
> ***************
> *** 587,599 ****
>                       return STATUS_ERROR;
>               }
>   #endif
> !             on = 1;
> !             if (setsockopt(port->sock, SOL_SOCKET, SO_KEEPALIVE,
> !                                        (char *) &on, sizeof(on)) < 0)
>               {
> !                     elog(LOG, "setsockopt(SO_KEEPALIVE) failed: %m");
>                       return STATUS_ERROR;
>               }
>       }
>   
>       return STATUS_OK;
> --- 588,645 ----
>                       return STATUS_ERROR;
>               }
>   #endif
> ! 
> !             if (pq_setkeepalives(tcp_keepalives, port) != STATUS_OK)
> !                     return STATUS_ERROR;
> ! 
> !             /* Grab default keepalive values, then apply
> !              * our GUC settings.
> !              */
> ! #ifdef TCP_KEEPIDLE
> !             if (getsockopt(port->sock, SOL_TCP, TCP_KEEPIDLE,
> !                                        (char *) &port->default_keepalives_idle, &size) < 0)
> !             {
> !                     elog(LOG, "getsockopt(TCP_KEEPIDLE) failed: %m");
> !                     return STATUS_ERROR;
> !             }
> ! #else
> !             port->default_keepalives_idle = 0;
> ! #endif
> ! 
> ! #ifdef TCP_KEEPINTVL
> !             if (getsockopt(port->sock, SOL_TCP, TCP_KEEPINTVL,
> !                                        (char *) &port->default_keepalives_interval, &size) < 0)
>               {
> !                     elog(LOG, "getsockopt(TCP_KEEPINTVL) failed: %m");
>                       return STATUS_ERROR;
>               }
> + #else
> +             port->default_keepalives_interval = 0;
> + #endif
> + 
> + #ifdef TCP_KEEPCNT
> +             if (getsockopt(port->sock, SOL_TCP, TCP_KEEPCNT,
> +                                        (char *) &port->default_keepalives_count, &size) < 0)
> +             {
> +                     elog(LOG, "getsockopt(TCP_KEEPCNT) failed: %m");
> +                     return STATUS_ERROR;
> +             }
> + #else
> +             port->default_keepalives_count = 0;
> + #endif
> +             
> +             /* Set default keepalive parameters. This should also catch
> +              * misconfigurations (non-zero values when socket options aren't
> +              * supported)
> +              */
> +             if (pq_setkeepalivesidle(tcp_keepalives_idle, port) != STATUS_OK)
> +                     return STATUS_ERROR;
> + 
> +             if (pq_setkeepalivesinterval(tcp_keepalives_interval, port) != STATUS_OK)
> +                     return STATUS_ERROR;
> + 
> +             if (pq_setkeepalivescount(tcp_keepalives_count, port) != STATUS_OK)
> +                     return STATUS_ERROR;
>       }
>   
>       return STATUS_OK;
> ***************
> *** 1158,1160 ****
> --- 1204,1305 ----
>       /* in non-error case, copy.c will have emitted the terminator line */
>       DoingCopyOut = false;
>   }
> + 
> + int
> + pq_setkeepalives(bool onoff, Port *port)
> + {
> +     int on = (onoff ? 1 : 0);
> + 
> +     if (IS_AF_UNIX(port->laddr.addr.ss_family))
> +             return STATUS_OK;
> + 
> +     if (setsockopt(port->sock, SOL_SOCKET, SO_KEEPALIVE,
> +                                (char *) &on, sizeof(on)) < 0)
> +     {
> +             elog(LOG, "setsockopt(SO_KEEPALIVE) failed: %m");
> +             return STATUS_ERROR;
> +     }
> + 
> +     return STATUS_OK;
> + }
> + 
> + int
> + pq_setkeepalivesidle(int idle, Port *port)
> + {
> +     if (IS_AF_UNIX(port->laddr.addr.ss_family))
> +             return STATUS_OK;
> + 
> + #ifdef TCP_KEEPIDLE
> +     if (idle == 0)
> +             idle = port->default_keepalives_idle;
> + 
> +     if (setsockopt(port->sock, SOL_TCP, TCP_KEEPIDLE,
> +                                (char *) &idle, sizeof(idle)) < 0)
> +     {
> +             elog(LOG, "setsockopt(TCP_KEEPIDLE) failed: %m");
> +             return STATUS_ERROR;
> +     }
> + #else
> +     if (idle != 0)
> +     {
> +             elog(LOG, "setsockopt(TCP_KEEPIDLE) not supported");
> +             return STATUS_ERROR;
> +     }
> + #endif
> + 
> +     return STATUS_OK;
> + }
> + 
> + int
> + pq_setkeepalivesinterval(int interval, Port *port)
> + {
> +     if (IS_AF_UNIX(port->laddr.addr.ss_family))
> +             return STATUS_OK;
> + 
> + #ifdef TCP_KEEPINTVL
> +     if (interval == 0)
> +             interval = port->default_keepalives_interval;
> + 
> +     if (setsockopt(port->sock, SOL_TCP, TCP_KEEPINTVL,
> +                                (char *) &interval, sizeof(interval)) < 0)
> +     {
> +             elog(LOG, "setsockopt(TCP_KEEPINTVL) failed: %m");
> +             return STATUS_ERROR;
> +     }
> + #else
> +     if (interval != 0)
> +     {
> +             elog(LOG, "setsockopt(TCP_KEEPINTVL) not supported");
> +             return STATUS_ERROR;
> +     }               
> + #endif
> + 
> +     return STATUS_OK;
> + }
> + 
> + int
> + pq_setkeepalivescount(int count, Port *port)
> + {
> +     if (IS_AF_UNIX(port->laddr.addr.ss_family))
> +             return STATUS_OK;
> + 
> + #ifdef TCP_KEEPCNT
> +     if (count == 0)
> +             count = port->default_keepalives_count;
> + 
> +     if (setsockopt(port->sock, SOL_TCP, TCP_KEEPCNT,
> +                                (char *) &count, sizeof(count)) < 0)
> +     {
> +             elog(LOG, "setsockopt(TCP_KEEPCNT) failed: %m");
> +             return STATUS_ERROR;
> +     }
> + #else
> +     if (count != 0)
> +     {
> +             elog(LOG, "setsockopt(TCP_KEEPCNT) not supported");
> +             return STATUS_ERROR;
> +     }
> + #endif
> + 
> +     return STATUS_OK;
> + }
> Index: src/backend/utils/misc/guc.c
> ===================================================================
> RCS file: /projects/cvsroot/pgsql/src/backend/utils/misc/guc.c,v
> retrieving revision 1.261
> diff -c -r1.261 guc.c
> *** src/backend/utils/misc/guc.c      1 May 2005 18:56:19 -0000       1.261
> --- src/backend/utils/misc/guc.c      3 May 2005 01:44:08 -0000
> ***************
> *** 113,118 ****
> --- 113,125 ----
>   static bool assign_transaction_read_only(bool newval, bool doit, GucSource source);
>   static const char *assign_canonical_path(const char *newval, bool doit, GucSource source);
>   
> + static bool assign_tcp_keepalives(bool newval, bool doit, GucSource source);
> + static bool assign_tcp_keepalives_idle(int newval, bool doit, GucSource source);
> + static bool assign_tcp_keepalives_interval(int newval, bool doit, GucSource source);
> + static bool assign_tcp_keepalives_count(int newval, bool doit, GucSource source);
> + static const char *show_tcp_keepalives_idle(void);
> + static const char *show_tcp_keepalives_interval(void);
> + static const char *show_tcp_keepalives_count(void);
>   
>   /*
>    * GUC option variables that are exported from this module
> ***************
> *** 154,159 ****
> --- 161,170 ----
>   char           *IdentFileName;
>   char           *external_pid_file;
>   
> + bool        tcp_keepalives = true;
> + int         tcp_keepalives_idle;
> + int         tcp_keepalives_interval;
> + int         tcp_keepalives_count;
>   
>   /*
>    * These variables are all dummies that don't do anything, except in some
> ***************
> *** 860,865 ****
> --- 871,885 ----
>   #endif
>       },
>   
> +     {
> +             {"tcp_keepalives", PGC_USERSET, CLIENT_CONN_OTHER,
> +                  gettext_noop("Use keepalives on client TCP connections."),
> +                  NULL,
> +             },              
> +             &tcp_keepalives,
> +             true, assign_tcp_keepalives, NULL
> +     },
> + 
>       /* End-of-list marker */
>       {
>               {NULL, 0, 0, NULL, NULL}, NULL, false, NULL, NULL
> ***************
> *** 1333,1338 ****
> --- 1353,1387 ----
>               BLCKSZ, BLCKSZ, BLCKSZ, NULL, NULL
>       },
>   
> +     {
> +             {"tcp_keepalives_idle", PGC_USERSET, CLIENT_CONN_OTHER,
> +                  gettext_noop("Seconds between issuing TCP keepalives."),
> +                  gettext_noop("A value of 0 uses the system default."),
> +             },              
> +             &tcp_keepalives_idle,
> +             0, 0, INT_MAX, assign_tcp_keepalives_idle, show_tcp_keepalives_idle
> +     },
> + 
> +     {
> +             {"tcp_keepalives_interval", PGC_USERSET, CLIENT_CONN_OTHER,
> +                  gettext_noop("Seconds between TCP keepalive retransmits."),
> +                  gettext_noop("A value of 0 uses the system default."),
> +             },              
> +             &tcp_keepalives_interval,
> +             0, 0, INT_MAX, assign_tcp_keepalives_interval, show_tcp_keepalives_interval
> +     },
> + 
> +     {
> +             {"tcp_keepalives_count", PGC_USERSET, CLIENT_CONN_OTHER,
> +                  gettext_noop("Maximum number of TCP keepalive retransmits."),
> +                  gettext_noop("This controls the number of consecutive keepalive retransmits that can be "
> +                                               "lost before a connection is considered dead. A value of 0 uses the "
> +                                               "system default."),
> +             },
> +             &tcp_keepalives_count,
> +             0, 0, INT_MAX, assign_tcp_keepalives_count, show_tcp_keepalives_count
> +     },
> + 
>       /* End-of-list marker */
>       {
>               {NULL, 0, 0, NULL, NULL}, NULL, 0, 0, 0, NULL, NULL
> ***************
> *** 5677,5681 ****
> --- 5726,5802 ----
>               return newval;
>   }
>   
> + static bool
> + assign_tcp_keepalives(bool newval, bool doit, GucSource source)
> + {
> +     if (doit && MyProcPort != NULL)
> +     {
> +             pq_setkeepalives(newval, MyProcPort);
> +     }
> +     return true;
> + }
> + 
> + static bool
> + assign_tcp_keepalives_idle(int newval, bool doit, GucSource source)
> + {
> +     if (doit && MyProcPort != NULL)
> +     {
> +             return (pq_setkeepalivesidle(newval, MyProcPort) == STATUS_OK);
> +     }
> + 
> +     return true;
> + }
> + 
> + static const char *
> + show_tcp_keepalives_idle(void)
> + {
> +     static char nbuf[32];
> +     snprintf(nbuf, sizeof(nbuf), "%d",
> +                      (tcp_keepalives_idle != 0 || MyProcPort == NULL)
> +                      ? tcp_keepalives_idle : MyProcPort->default_keepalives_idle);
> +     return nbuf;
> + }
> + 
> + static bool
> + assign_tcp_keepalives_interval(int newval, bool doit, GucSource source)
> + {
> +     if (doit && MyProcPort != NULL)
> +     {
> +             return (pq_setkeepalivesinterval(newval, MyProcPort) == STATUS_OK);
> +     }
> + 
> +     return true;
> + }
> + 
> + static const char *
> + show_tcp_keepalives_interval(void)
> + {
> +     static char nbuf[32];
> +     snprintf(nbuf, sizeof(nbuf), "%d",
> +                      (tcp_keepalives_interval != 0 || MyProcPort == NULL)
> +                      ? tcp_keepalives_interval : MyProcPort->default_keepalives_interval);
> +     return nbuf;
> + }
> + 
> + static bool
> + assign_tcp_keepalives_count(int newval, bool doit, GucSource source)
> + {
> +     if (doit && MyProcPort != NULL)
> +     {
> +             return (pq_setkeepalivescount(newval, MyProcPort) == STATUS_OK);
> +     }
> + 
> +     return true;
> + }
> + 
> + static const char *
> + show_tcp_keepalives_count(void)
> + {
> +     static char nbuf[32];
> +     snprintf(nbuf, sizeof(nbuf), "%d",
> +                      (tcp_keepalives_count != 0 || MyProcPort == NULL)
> +                      ? tcp_keepalives_count : MyProcPort->default_keepalives_count);
> +     return nbuf;
> + }
>   
>   #include "guc-file.c"
> Index: src/backend/utils/misc/postgresql.conf.sample
> ===================================================================
> RCS file: /projects/cvsroot/pgsql/src/backend/utils/misc/postgresql.conf.sample,v
> retrieving revision 1.140
> diff -c -r1.140 postgresql.conf.sample
> *** src/backend/utils/misc/postgresql.conf.sample     21 Apr 2005 19:18:13 -0000      1.140
> --- src/backend/utils/misc/postgresql.conf.sample     3 May 2005 01:44:08 -0000
> ***************
> *** 66,71 ****
> --- 66,77 ----
>   #krb_server_keyfile = ''
>   #db_user_namespace = false
>   
> + # - TCP Keepalives -
> + # see 'man 7 tcp' for details
> + #tcp_keepalives = on            # enables use of TCP keepalives
> + #tcp_keepalives_idle = 0        # TCP_KEEPIDLE, in seconds; 0 uses the system default.
> + #tcp_keepalives_interval = 0    # TCP_KEEPINTVL, in seconds; 0 uses the system default.
> + #tcp_keepalives_count = 0       # TCP_KEEPCNT, number of probes; 0 uses the system default.
>   
>   #---------------------------------------------------------------------------
>   # RESOURCE USAGE (except WAL)
> Index: src/bin/psql/tab-complete.c
> ===================================================================
> RCS file: /projects/cvsroot/pgsql/src/bin/psql/tab-complete.c,v
> retrieving revision 1.125
> diff -c -r1.125 tab-complete.c
> *** src/bin/psql/tab-complete.c       21 Apr 2005 19:18:13 -0000      1.125
> --- src/bin/psql/tab-complete.c       3 May 2005 01:44:10 -0000
> ***************
> *** 595,600 ****
> --- 595,604 ----
>               "superuser_reserved_connections",
>               "syslog_facility",
>               "syslog_ident",
> +             "tcp_keepalives",
> +             "tcp_keepalives_idle",
> +             "tcp_keepalives_interval",
> +             "tcp_keepalives_count",
>               "temp_buffers",
>               "TimeZone",
>               "trace_notify",
> Index: src/include/libpq/libpq-be.h
> ===================================================================
> RCS file: /projects/cvsroot/pgsql/src/include/libpq/libpq-be.h,v
> retrieving revision 1.49
> diff -c -r1.49 libpq-be.h
> *** src/include/libpq/libpq-be.h      31 Dec 2004 22:03:32 -0000      1.49
> --- src/include/libpq/libpq-be.h      3 May 2005 01:44:10 -0000
> ***************
> *** 92,100 ****
> --- 92,113 ----
>       char            peer_cn[SM_USER + 1];
>       unsigned long count;
>   #endif
> + 
> +     /*
> +      * Default TCP keepalive values found after accept(); 0 if unsupported or AF_UNIX.
> +      */
> +     int         default_keepalives_idle;
> +     int         default_keepalives_interval;
> +     int         default_keepalives_count;
>   } Port;
>   
>   
>   extern ProtocolVersion FrontendProtocol;
>   
> + /* TCP keepalives configuration. These are no-ops on an AF_UNIX socket. */
> + extern int pq_setkeepalives(bool onoff, Port *port);
> + extern int pq_setkeepalivesidle(int idle, Port *port);
> + extern int pq_setkeepalivesinterval(int interval, Port *port);
> + extern int pq_setkeepalivescount(int count, Port *port);
> + 
>   #endif   /* LIBPQ_BE_H */
> Index: src/include/utils/guc.h
> ===================================================================
> RCS file: /projects/cvsroot/pgsql/src/include/utils/guc.h,v
> retrieving revision 1.60
> diff -c -r1.60 guc.h
> *** src/include/utils/guc.h   25 Mar 2005 16:17:28 -0000      1.60
> --- src/include/utils/guc.h   3 May 2005 01:44:10 -0000
> ***************
> *** 133,138 ****
> --- 133,142 ----
>   extern char *IdentFileName;
>   extern char *external_pid_file;
>   
> + extern bool tcp_keepalives;
> + extern int  tcp_keepalives_idle;
> + extern int  tcp_keepalives_interval;
> + extern int  tcp_keepalives_count;
>   
>   extern void SetConfigOption(const char *name, const char *value,
>                               GucContext context, GucSource source);
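
The "0 means system default" convention used by the pq_setkeepalives*()
functions can be tried outside the backend with something like the
following.  It is a sketch under assumptions, not backend code:
get_default_keepidle() and set_keepidle() are hypothetical names, only
TCP_KEEPIDLE is shown, and IPPROTO_TCP is used as the option level.  Note
that getsockopt() takes the length as a value-result argument, so it has
to be initialized before the call.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <unistd.h>

/*
 * Hypothetical helper: fetch the OS default idle time for a TCP socket.
 * Stores 0 in *idle when the platform does not offer TCP_KEEPIDLE.
 * Returns 0 on success, -1 on failure.
 */
static int
get_default_keepidle(int sock, int *idle)
{
#ifdef TCP_KEEPIDLE
	socklen_t	size = sizeof(*idle);	/* must be set before the call */

	if (getsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE,
				   (char *) idle, &size) < 0)
		return -1;
#else
	(void) sock;
	*idle = 0;
#endif
	return 0;
}

/*
 * Hypothetical helper: apply an idle setting where 0 means "use the
 * default captured earlier".  A non-zero value is rejected when the
 * platform lacks TCP_KEEPIDLE.
 */
static int
set_keepidle(int sock, int idle, int default_idle)
{
#ifdef TCP_KEEPIDLE
	if (idle == 0)
		idle = default_idle;

	if (setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE,
				   (char *) &idle, sizeof(idle)) < 0)
		return -1;
	return 0;
#else
	(void) sock;
	(void) default_idle;
	return (idle == 0) ? 0 : -1;
#endif
}
```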


-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
