Re: [HACKERS] TCP keepalive support for libpq

2010-06-24 Thread Tom Lane
Greg Stark writes: > I think it's going to be an uphill battle convincing TCP that we know > better than the TCP spec about how aggressive it should be about > throwing errors and killing connections. Once we have TCP keepalives > set low enough -- assuming the OS will allow it to be set much lowe

Re: [HACKERS] TCP keepalive support for libpq

2010-06-24 Thread Kevin Grittner
Greg Stark wrote: > we'll find that other timeouts are longer than we expect too. TCP > Keepalives won't come into it at all if there is any unacked data > pending -- TCP *will* detect that case but it might take longer > than you want too and you won't be able to lower it. If memory serves a

Re: [HACKERS] TCP keepalive support for libpq

2010-06-24 Thread Greg Stark
On Tue, Jun 22, 2010 at 6:04 PM, Kevin Grittner wrote: > Robert Haas wrote: > >> What does bother me is the fact that we are engineering a critical >> aspect of our system reliability around vendor-specific >> implementation details of the TCP stack, and that if any version >> of any operating sy

Re: [HACKERS] TCP keepalive support for libpq

2010-06-24 Thread Magnus Hagander
On Thu, Jun 24, 2010 at 03:14, Robert Haas wrote: > On Tue, Jun 22, 2010 at 12:32 PM, Magnus Hagander wrote: >> I looked around quickly earlier when we chatted about this, and I >> think I found an API call to change them for a socket as well - but a >> Windows specific one, not the ones you'd fi

Re: [HACKERS] TCP keepalive support for libpq

2010-06-23 Thread Robert Haas
On Tue, Jun 22, 2010 at 12:32 PM, Magnus Hagander wrote: > I looked around quickly earlier when we chatted about this, and I > think I found an API call to change them for a socket as well - but a > Windows specific one, not the ones you'd find on Unix... Magnus - or anyone who knows Windows - N

Re: [HACKERS] TCP keepalive support for libpq

2010-06-23 Thread Robert Haas
On Wed, Jun 23, 2010 at 4:56 PM, Tom Lane wrote: > Fujii Masao writes: >> On Wed, Jun 23, 2010 at 5:32 AM, Robert Haas wrote: >>> OK, here's a new version with several fewer bugs. > >> Since valid values for keepalives parameter are 0 and 1, its field size >> should >> be 1 rather than 10. > >

Re: [HACKERS] TCP keepalive support for libpq

2010-06-23 Thread Tom Lane
Fujii Masao writes: > On Wed, Jun 23, 2010 at 5:32 AM, Robert Haas wrote: >> OK, here's a new version with several fewer bugs. > Since valid values for keepalives parameter are 0 and 1, its field size should > be 1 rather than 10. Right ... although maybe it should be considered a boolean and n
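
For readers following along, here is a rough sketch of how the conninfo options under review might be assembled on the client side. It only builds the string and does not connect to anything. The option names `keepalives`, `keepalives_idle`, `keepalives_interval` and `keepalives_count` are the ones named in this thread; the helper name `keepalive_conninfo` and the example host are made up for illustration.

```python
def keepalive_conninfo(base, keepalives=1, idle=None, interval=None, count=None):
    """Append keepalive settings to a libpq-style conninfo string.

    Hypothetical helper: `base` is an existing "key=value ..." string.
    """
    # The thread notes that the only valid values for keepalives are 0 and 1.
    if keepalives not in (0, 1):
        raise ValueError("keepalives must be 0 or 1")
    parts = [base, f"keepalives={keepalives}"]
    optional = {
        "keepalives_idle": idle,          # seconds idle before the first probe
        "keepalives_interval": interval,  # seconds between unanswered probes
        "keepalives_count": count,        # unanswered probes before giving up
    }
    # Leave out anything the caller did not set, so OS defaults apply.
    parts += [f"{key}={value}" for key, value in optional.items() if value is not None]
    return " ".join(parts)
```

Calling `keepalive_conninfo("host=db.example.com dbname=app", idle=60, interval=10, count=5)` yields a string ending in `keepalives=1 keepalives_idle=60 keepalives_interval=10 keepalives_count=5`, which could then be handed to whatever driver is in use.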

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Fujii Masao
On Wed, Jun 23, 2010 at 5:32 AM, Robert Haas wrote: > On Tue, Jun 22, 2010 at 3:45 PM, Robert Haas wrote: >> On Tue, Jun 22, 2010 at 3:28 PM, Robert Haas wrote: >>> Either I'm doing something wrong, >> >> I think it's this one.  Stand by. > > OK, here's a new version with several fewer bugs. Si

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Robert Haas
On Tue, Jun 22, 2010 at 3:45 PM, Robert Haas wrote: > On Tue, Jun 22, 2010 at 3:28 PM, Robert Haas wrote: >> Either I'm doing something wrong, > > I think it's this one.  Stand by. OK, here's a new version with several fewer bugs. This does appear to work on both Linux and MacOS now, which are

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Robert Haas
On Tue, Jun 22, 2010 at 3:28 PM, Robert Haas wrote: > Either I'm doing something wrong, I think it's this one. Stand by. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make chan

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Robert Haas
On Tue, Jun 22, 2010 at 1:32 PM, Robert Haas wrote: > I don't think the burden of proof is on me to demonstrate that there's > a case where this feature isn't available - we're usually quite > reluctant to take advantage of platform-specific features unless we > have strong evidence that they are

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Josh Berkus
> In what environment do you see that causing a problem (compared to > no keepalive)? If it were Alpha3 right now, I'd have no issue with it, and if we're talking about it for 9.1 I'd have no issue with it. I am, however, extremely reluctant to introduce a default behavior change for Beta3.

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Kevin Grittner
Josh Berkus wrote: > We might not want to enable them by default, though. I have a hard time believing that "enabled by default" is a problem with the default timings. That would result in sending and receiving one small packet every two hours on an open connection with no application traffic
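
To put a number on the "two hours" figure, a back-of-the-envelope calculation may help. The sketch below assumes the usual Linux sysctl defaults (7200 s idle, 75 s between probes, 9 probes); other operating systems use different values, and the function name is invented for this example.

```python
def worst_case_detection_seconds(idle, interval, count):
    """Idle time before the first probe plus the time for every probe to go unanswered."""
    return idle + interval * count

# Assumed Linux defaults: tcp_keepalive_time, tcp_keepalive_intvl, tcp_keepalive_probes.
LINUX_DEFAULTS = {"idle": 7200, "interval": 75, "count": 9}

default_seconds = worst_case_detection_seconds(**LINUX_DEFAULTS)  # 7200 + 75 * 9 = 7875
tuned_seconds = worst_case_detection_seconds(idle=60, interval=10, count=5)  # 60 + 10 * 5 = 110
```

Under those assumptions a dead peer on an idle connection is noticed after roughly 2 hours 11 minutes with the defaults, versus under two minutes with the tuned values, which is why the tunables matter for the replication case even though the probe traffic itself is negligible.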

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Josh Berkus
All, If we *don't* rely on tcp-keepalive for terminating SR connections where the master is dead, what is the alternative? That issue, IMHO, is a blocker for 9.0. If tcp-keepalives are the only idea we have, then we need to work around the limitations and implement them. I'll also point out tha

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Robert Haas
On Tue, Jun 22, 2010 at 1:14 PM, Tom Lane wrote: > Robert Haas writes: >> On Tue, Jun 22, 2010 at 12:50 PM, Tom Lane wrote: >>> Robert Haas writes: >>> By that argument, we need to be programming to bare metal on every disk >>> access.  Does anyone want to argue that depending on vendor-specifi

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Florian Pflug
On Jun 22, 2010, at 18:43 , Robert Haas wrote: > What does bother me is the fact that we are engineering a critical > aspect of our system reliability around vendor-specific implementation > details of the TCP stack, and that if any version of any operating > system that we support (or ever wish to

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Tom Lane
Robert Haas writes: > On Tue, Jun 22, 2010 at 12:50 PM, Tom Lane wrote: >> Robert Haas writes: >> By that argument, we need to be programming to bare metal on every disk >> access.  Does anyone want to argue that depending on vendor-specific >> filesystem functionality is not a house of cards?  

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Robert Haas
On Tue, Jun 22, 2010 at 12:50 PM, Tom Lane wrote: > Robert Haas writes: >> What does bother me is the fact that we are engineering a critical >> aspect of our system reliability around vendor-specific implementation >> details of the TCP stack, and that if any version of any operating >> system t

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Kevin Grittner
Robert Haas wrote: > What does bother me is the fact that we are engineering a critical > aspect of our system reliability around vendor-specific > implementation details of the TCP stack, and that if any version > of any operating system that we support (or ever wish to support > in the future)

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Tom Lane
Robert Haas writes: > What does bother me is the fact that we are engineering a critical > aspect of our system reliability around vendor-specific implementation > details of the TCP stack, and that if any version of any operating > system that we support (or ever wish to support in the future) fa

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Robert Haas
On Tue, Jun 22, 2010 at 12:32 PM, Magnus Hagander wrote: >> Which brings us to the question of portability.  A quick search around >> the Internet suggests that this is supported on recent versions of >> Linux, Free/OpenBSD, AIX, and HP/UX, and it appears to work on my Mac >> also.  I'm not clear

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Magnus Hagander
On Tue, Jun 22, 2010 at 18:16, Robert Haas wrote: > On Tue, Jun 22, 2010 at 9:27 AM, Magnus Hagander wrote: >>> I am inclined to punt the keepalives_interval, keepalives_idle, and >>> keepalives_count parameters to 9.1.  If these are needed for >>> walreceiver to work reliably, this whole approac

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Robert Haas
On Tue, Jun 22, 2010 at 9:27 AM, Magnus Hagander wrote: >> I am inclined to punt the keepalives_interval, keepalives_idle, and >> keepalives_count parameters to 9.1.  If these are needed for >> walreceiver to work reliably, this whole approach is a dead-end, >> because those parameters are not por

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Magnus Hagander
On Tue, Jun 22, 2010 at 15:20, Robert Haas wrote: > On Mon, Feb 15, 2010 at 8:58 PM, Fujii Masao wrote: >> On Tue, Feb 16, 2010 at 1:18 AM, Robert Haas wrote: I'm all for this as a 9.1 submission, but let's not commit to trying to debug it now.  I would like a green buildfarm for awhil

Re: [HACKERS] TCP keepalive support for libpq

2010-06-22 Thread Robert Haas
On Mon, Feb 15, 2010 at 8:58 PM, Fujii Masao wrote: > On Tue, Feb 16, 2010 at 1:18 AM, Robert Haas wrote: >>> I'm all for this as a 9.1 submission, but let's not commit to trying to >>> debug it now.  I would like a green buildfarm for awhile before we wrap >>> alpha4, and this sort of untested "

Re: [HACKERS] TCP keepalive support for libpq

2010-02-15 Thread Fujii Masao
On Tue, Feb 16, 2010 at 1:18 AM, Robert Haas wrote: >> I'm all for this as a 9.1 submission, but let's not commit to trying to >> debug it now.  I would like a green buildfarm for awhile before we wrap >> alpha4, and this sort of untested "it can't hurt" patch is exactly what >> is likely to make

Re: [HACKERS] TCP keepalive support for libpq

2010-02-15 Thread Robert Haas
On Mon, Feb 15, 2010 at 11:15 AM, Tom Lane wrote: > Robert Haas writes: >> On Mon, Feb 15, 2010 at 11:00 AM, Tom Lane wrote: >>> If this were actually a low-risk patch I might think it was okay to try >>> to shoehorn it in now; but IME nothing involving making new use of >>> system-dependent API

Re: [HACKERS] TCP keepalive support for libpq

2010-02-15 Thread Tom Lane
Robert Haas writes: > On Mon, Feb 15, 2010 at 11:00 AM, Tom Lane wrote: >> If this were actually a low-risk patch I might think it was okay to try >> to shoehorn it in now; but IME nothing involving making new use of >> system-dependent APIs is ever low-risk.  Look at Greg's current >> embarrassm

Re: [HACKERS] TCP keepalive support for libpq

2010-02-15 Thread Magnus Hagander
2010/2/15 Robert Haas : > On Mon, Feb 15, 2010 at 11:00 AM, Tom Lane wrote: >> Euler Taveira de Oliveira writes: >>> Magnus Hagander wrote: If we want to do this, I'd be inclined to say we sneak this into 9.0.. It's small enough ;) >>> I'm afraid Robert will say a big NO. ;) I'm

Re: [HACKERS] TCP keepalive support for libpq

2010-02-15 Thread Robert Haas
On Mon, Feb 15, 2010 at 11:00 AM, Tom Lane wrote: > Euler Taveira de Oliveira writes: >> Magnus Hagander wrote: >>> If we want to do this, I'd be inclined to say we sneak this into 9.0.. >>> It's small enough ;) >>> >> I'm afraid Robert will say a big NO. ;) I'm not against your idea; so if >>

Re: [HACKERS] TCP keepalive support for libpq

2010-02-15 Thread Tom Lane
Euler Taveira de Oliveira writes: > Magnus Hagander wrote: >> If we want to do this, I'd be inclined to say we sneak this into 9.0.. >> It's small enough ;) >> > I'm afraid Robert will say a big NO. ;) I'm not against your idea; so if > nobody objects go for it *now*. If Robert doesn't I will

Re: [HACKERS] TCP keepalive support for libpq

2010-02-15 Thread Euler Taveira de Oliveira
Magnus Hagander wrote: > If we want to do this, I'd be inclined to say we sneak this into 9.0.. > It's small enough ;) > I'm afraid Robert will say a big NO. ;) I'm not against your idea; so if nobody objects go for it *now*. -- Euler Taveira de Oliveira http://www.timbira.com/

Re: [HACKERS] TCP keepalive support for libpq

2010-02-15 Thread Robert Haas
On Mon, Feb 15, 2010 at 9:52 AM, Magnus Hagander wrote: > 2010/2/15 Euler Taveira de Oliveira : >> Fujii Masao wrote: >>> Here is the patch which provides those three parameters as conninfo >>> options. Should this patch be added into the first CommitFest for v9.1? >>> >> Go ahead. > > If we wa

Re: [HACKERS] TCP keepalive support for libpq

2010-02-15 Thread Magnus Hagander
2010/2/15 Euler Taveira de Oliveira : > Fujii Masao wrote: >> Here is the patch which provides those three parameters as conninfo >> options. Should this patch be added into the first CommitFest for v9.1? >> > Go ahead. If we want to do this, I'd be inclined to say we sneak this into 9.0.. It's

Re: [HACKERS] TCP keepalive support for libpq

2010-02-15 Thread Euler Taveira de Oliveira
Fujii Masao wrote: > Here is the patch which provides those three parameters as conninfo > options. Should this patch be added into the first CommitFest for v9.1? > Go ahead.

Re: [HACKERS] TCP keepalive support for libpq

2010-02-15 Thread Fujii Masao
On Sat, Feb 13, 2010 at 2:13 AM, Euler Taveira de Oliveira wrote: > Marko Kreen wrote: >> 3) Support all 3 parameters (keepidle, keepintvl, keepcnt) >>  and ignore parameters not supported by OS. >> > +1. AFAIR, we already do that for the backend. +1 from me, too. Here is the patch which prov
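
The "support all three, ignore what the OS lacks" approach can be sketched roughly as follows. This is not the patch itself, just an illustration of the idea using Python's `socket` module, where `TCP_KEEPIDLE`, `TCP_KEEPINTVL` and `TCP_KEEPCNT` are only defined on platforms that provide them. The function name `tune_keepalive` and the setting names `idle`, `interval` and `count` are assumptions made for this example.

```python
import socket

# Socket-module constant for each setting. A constant is present only when
# the platform's TCP stack exposes it, hence the hasattr() check below.
_TUNABLES = {
    "idle": "TCP_KEEPIDLE",       # seconds of idleness before the first probe
    "interval": "TCP_KEEPINTVL",  # seconds between unanswered probes
    "count": "TCP_KEEPCNT",       # unanswered probes before the connection is dropped
}

def tune_keepalive(sock, **settings):
    """Enable keepalives and apply whichever tunables this platform supports.

    Returns the names of the settings that were actually applied;
    unsupported ones are skipped silently.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = []
    for name, value in settings.items():
        const = _TUNABLES[name]  # raises KeyError for an unknown setting name
        if hasattr(socket, const):
            sock.setsockopt(socket.IPPROTO_TCP, getattr(socket, const), value)
            applied.append(name)
    return applied
```

Returning the list of applied settings is a design choice of this sketch: it lets the caller log or warn about the ones that were dropped instead of failing the connection attempt.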

Re: [HACKERS] TCP keepalive support for libpq

2010-02-12 Thread Euler Taveira de Oliveira
Marko Kreen wrote: > 3) Support all 3 parameters (keepidle, keepintvl, keepcnt) > and ignore parameters not supported by OS. > +1. AFAIR, we already do that for the backend.

Re: [HACKERS] TCP keepalive support for libpq

2010-02-12 Thread Marko Kreen
On 2/11/10, Tollef Fog Heen wrote: > | I disagree. I have clients who have problems with leftover client > connections > | due to server host failures. They do not write apps in C. For a non-default > | change to be effective we would need to have all the client drivers, eg > JDBC, > | psyco

Re: [HACKERS] TCP keepalive support for libpq

2010-02-12 Thread Fujii Masao
On Fri, Feb 12, 2010 at 6:40 PM, Peter Geoghegan wrote: >> keepalive doesn't work, >> then it would have to wait for a long time until it detects >> the outage. > > I'm not really sure what you mean. In this scenario, would it take as > long as it would have taken had keepalives not been used? Pl

Re: [HACKERS] TCP keepalive support for libpq

2010-02-12 Thread Peter Geoghegan
> wrote: >> Why hasn't libpq had keepalives for years? > > I guess that it's because keepalive doesn't work as expected > in some cases. For example, if the network outage happens > before a client sends some packets, keepalive doesn't work, > then it would have to wait for a long time until it de

Re: [HACKERS] TCP keepalive support for libpq

2010-02-11 Thread Fujii Masao
On Fri, Feb 12, 2010 at 1:33 AM, Peter Geoghegan wrote: > Why hasn't libpq had keepalives for years? I guess that it's because keepalive doesn't work as expected in some cases. For example, if the network outage happens before a client sends some packets, keepalive doesn't work, then it would hav

Re: [HACKERS] TCP keepalive support for libpq

2010-02-11 Thread Tollef Fog Heen
]] Robert Haas | I've sometimes wondered why keepalives aren't the default for all TCP | connections. They seem like they're usually a Good Thing (TM), but I | wonder if we can think of any situations where someone might not want | them? As somebody mentioned somewhere else (I think): If you pa

Re: [HACKERS] TCP keepalive support for libpq

2010-02-11 Thread Kris Jurka
On Thu, 11 Feb 2010, Andrew Chernow wrote: Although, I think Dave's comments have made me change my mind about this patch. Looks like it serves a good purpose. That said, there is no guarantee the driver will implement the new feature ... JDBC seems to lack the ability to get the backing

Re: [HACKERS] TCP keepalive support for libpq

2010-02-11 Thread Peter Geoghegan
Also, more importantly (from http://www.slony.info/documentation/slonyadmin.html): "A WAN outage (or flakiness of the WAN in general) can leave database connections "zombied", and typical TCP/IP behaviour will allow those connections to persist, preventing a slon restart for around two hours. "

Re: [HACKERS] TCP keepalive support for libpq

2010-02-11 Thread Dimitri Fontaine
"Kevin Grittner" writes: > those people who create 2000 lightly used connections to the > database might feel differently. Yeah, I still run into installations using the infamous PHP pconnect() function. You certainly don't want to add some load there, but that could urge them into arranging for

Re: [HACKERS] TCP keepalive support for libpq

2010-02-11 Thread Andrew Chernow
Robert Haas wrote: On Thu, Feb 11, 2010 at 2:15 AM, Tollef Fog Heen wrote: ]] daveg | I disagree. I have clients who have problems with leftover client connections | due to server host failures. They do not write apps in C. For a non-default | change to be effective we would need to have all t

Re: [HACKERS] TCP keepalive support for libpq

2010-02-11 Thread Peter Geoghegan
From the Slony-I docs (http://www.slony.info/documentation/faq.html) : "Supposing you experience some sort of network outage, the connection between slon and database may fail, and the slon may figure this out long before the PostgreSQL instance it was connected to does. The result is that there

Re: [HACKERS] TCP keepalive support for libpq

2010-02-11 Thread Kevin Grittner
Robert Haas wrote: > I've sometimes wondered why keepalives aren't the default for all > TCP connections. They seem like they're usually a Good Thing > (TM), but I wonder if we can think of any situations where someone > might not want them? I think it's insane not to use them at all, but the

Re: [HACKERS] TCP keepalive support for libpq

2010-02-11 Thread Robert Haas
On Thu, Feb 11, 2010 at 2:15 AM, Tollef Fog Heen wrote: > ]] daveg > > | I disagree. I have clients who have problems with leftover client > connections > | due to server host failures. They do not write apps in C. For a non-default > | change to be effective we would need to have all the client

Re: [HACKERS] TCP keepalive support for libpq

2010-02-11 Thread Tollef Fog Heen
]] daveg | I disagree. I have clients who have problems with leftover client connections | due to server host failures. They do not write apps in C. For a non-default | change to be effective we would need to have all the client drivers, eg JDBC, | psycopg, DBD-DBI, and the apps like psql make ch

Re: [HACKERS] TCP keepalive support for libpq

2010-02-11 Thread Andrew Chernow
ISTM that the default behavior should be keepalives disabled, as it is now, and those wanting it can just set it in their apps: setsockopt(PQsocket(conn), SOL_SOCKET, SO_KEEPALIVE, ...) I disagree. I have clients who have problems with leftover client connections due to server host failures.
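
For comparison, the application-side approach mentioned above amounts to a single socket option. In C the descriptor would come from `PQsocket(conn)`; the sketch below uses a plain Python socket as a stand-in, and the helper name `enable_keepalive` is hypothetical.

```python
import socket

def enable_keepalive(sock):
    """Turn on SO_KEEPALIVE for an already-open TCP socket and report the result."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # Read the option back; a non-zero value means keepalive probes are enabled.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```

This covers only applications that have access to the underlying socket, which is the gap the objection in this message is about: users of drivers that hide the socket have no place to make the call.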

Re: [HACKERS] TCP keepalive support for libpq

2010-02-11 Thread Magnus Hagander
2010/2/10 daveg : > On Tue, Feb 09, 2010 at 09:34:10AM -0500, Andrew Chernow wrote: >> Tollef Fog Heen wrote: >> >(please Cc me on replies, I am not subscribed) >> > >> >Hi, >> > >> >libpq currently does not use TCP keepalives.  This is a problem in our >> >case where we have some clients waiting f

Re: [HACKERS] TCP keepalive support for libpq

2010-02-10 Thread daveg
On Tue, Feb 09, 2010 at 09:34:10AM -0500, Andrew Chernow wrote: > Tollef Fog Heen wrote: > >(please Cc me on replies, I am not subscribed) > > > >Hi, > > > >libpq currently does not use TCP keepalives. This is a problem in our > >case where we have some clients waiting for notifies and then the >

Re: [HACKERS] TCP keepalive support for libpq

2010-02-09 Thread Fujii Masao
On Tue, Feb 9, 2010 at 11:34 PM, Andrew Chernow wrote: > If you really want libpq to manage this, I think you need to expose the > probe interval and timeouts. Agreed. Previously I was making the patch that exposes them as conninfo options so that the standby can detect a network outage ASAP in

Re: [HACKERS] TCP keepalive support for libpq

2010-02-09 Thread Andrew Chernow
Tollef Fog Heen wrote: (please Cc me on replies, I am not subscribed) Hi, libpq currently does not use TCP keepalives. This is a problem in our case where we have some clients waiting for notifies and then the connection is dropped on the server side. The client never gets the FIN and thinks

Re: [HACKERS] TCP keepalive support for libpq

2010-02-09 Thread Tollef Fog Heen
]] Magnus Hagander | Seems reasonable to add this. Are there any scenarios where this can | cause trouble, that would be fixed by having the ability to select | non-standard behavior? Well, it might be unwanted if you're on a pay-per-bit connection such as 3G, but in this case, it just makes the

Re: [HACKERS] TCP keepalive support for libpq

2010-02-09 Thread Magnus Hagander
On Tue, Feb 9, 2010 at 14:03, Tollef Fog Heen wrote: > > (please Cc me on replies, I am not subscribed) > > Hi, > > libpq currently does not use TCP keepalives.  This is a problem in our > case where we have some clients waiting for notifies and then the > connection is dropped on the server side.

[HACKERS] TCP keepalive support for libpq

2010-02-09 Thread Tollef Fog Heen
(please Cc me on replies, I am not subscribed) Hi, libpq currently does not use TCP keepalives. This is a problem in our case where we have some clients waiting for notifies and then the connection is dropped on the server side. The client never gets the FIN and thinks the connection is up. T