Re: [HACKERS] beta3 the open items list

2010-06-28 Thread Thom Brown
On 19 June 2010 14:43, Robert Haas robertmh...@gmail.com wrote: It would be nice if we could make a final push to get these issues resolved and another beta out the door before the end of the month... So should we expect beta3 imminently, or are these issues still outstanding? Thanks Thom

Re: [HACKERS] beta3 the open items list

2010-06-21 Thread Greg Stark
On Mon, Jun 21, 2010 at 4:54 AM, Robert Haas robertmh...@gmail.com wrote: I feel like we're getting off in the weeds, here.  Obviously, the user would ideally like the connection to the master to last forever, but equally obviously, if the master unexpectedly reboots, they'd like the slave to

Re: [HACKERS] beta3 the open items list

2010-06-21 Thread Robert Haas
On Mon, Jun 21, 2010 at 4:37 AM, Greg Stark gsst...@mit.edu wrote: On Mon, Jun 21, 2010 at 4:54 AM, Robert Haas robertmh...@gmail.com wrote: I feel like we're getting off in the weeds, here.  Obviously, the user would ideally like the connection to the master to last forever, but equally

Re: [HACKERS] beta3 the open items list

2010-06-21 Thread Robert Haas
On Sun, Jun 20, 2010 at 5:52 PM, Tom Lane t...@sss.pgh.pa.us wrote: On a quick read, I think I see a problem with this: if a parameter is specified with a non-zero value and there is no OS support available for that parameter, it's an error.  Presumably, for our purposes here, we'd prefer to

Re: [HACKERS] beta3 the open items list

2010-06-20 Thread Andres Freund
On Saturday 19 June 2010 18:05:34 Joshua D. Drake wrote: On Sat, 2010-06-19 at 09:43 -0400, Robert Haas wrote: 4. Streaming Replication needs to detect death of master. We need some sort of keep-alive, here. Whether it's at the TCP level (as advocated by Tom Lane and others) or at the

Re: [HACKERS] beta3 the open items list

2010-06-20 Thread Florian Pflug
On Jun 20, 2010, at 7:18 , Tom Lane wrote: Florian Pflug f...@phlo.org writes: On Jun 19, 2010, at 21:13 , Tom Lane wrote: This is nonsense --- the slave's kernel *will* eventually notice that the TCP connection is dead, and tell walreceiver so. I don't doubt that the standard TCP timeout is

Re: [HACKERS] beta3 the open items list

2010-06-20 Thread Kevin Grittner
Florian Pflug wrote: On Jun 20, 2010, at 7:18 , Tom Lane wrote: I was thinking that keepalive was on by default with a very long interval, but I see this isn't so. However, if we enable keepalive, then it's irrelevant to the point anyway. Nobody's produced any evidence that keepalive is an

Re: [HACKERS] beta3 the open items list

2010-06-20 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes: Florian Pflug wrote: Yeah, I agree. Just enabling keepalive should suffice for 9.0. +1, with configurable timeout; Right, of course. That's already in the pending patch isn't it? regards, tom lane
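
For readers following the "enable keepalive, with configurable timeout" suggestion, here is a minimal sketch of what that means at the socket level. It is not the pending patch and not walreceiver code. The function names `enable_keepalive` and `worst_case_detection_seconds` and the default values are made up for illustration. The timing options (`TCP_KEEPIDLE`, `TCP_KEEPINTVL`, `TCP_KEEPCNT`) are not available on every OS, so the sketch assumes that skipping an unsupported option is acceptable rather than treating it as an error.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive and tune its timing where the OS supports it.

    Hypothetical helper for illustration only. Returns a dict of the
    timing options that were actually applied.
    """
    # SO_KEEPALIVE itself is widely supported; it only switches probing on.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    applied = {}
    wanted = (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between unanswered probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    )
    for name, value in wanted:
        option = getattr(socket, name, None)
        if option is None or value <= 0:
            # Assumption: a missing option or a zero value means
            # "leave the OS default alone".
            continue
        try:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
        except OSError:
            # Assumption: the platform rejected the option; keep defaults.
            continue
        applied[name] = value
    return applied


def worst_case_detection_seconds(idle, interval, count):
    """Rough upper bound on how long a silent peer goes unnoticed."""
    return idle + interval * count


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    print(enable_keepalive(s, idle=30, interval=5, count=3))
    print(worst_case_detection_seconds(30, 5, 3))
    s.close()
```

With the values in the usage example, an idle connection to a vanished peer would be reported dead after roughly 30 + 5 * 3 = 45 seconds, assuming the OS honours all three settings. Without them, the OS defaults apply, and those are typically much longer.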

Re: [HACKERS] beta3 the open items list

2010-06-20 Thread Joshua D. Drake
On Sun, 2010-06-20 at 11:36 -0400, Tom Lane wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Florian Pflug wrote: Yeah, I agree. Just enabling keepalive should suffice for 9.0. +1, with configurable timeout; Right, of course. That's already in the pending patch isn't it?

Re: [HACKERS] beta3 the open items list

2010-06-20 Thread Kevin Grittner
Joshua D. Drake wrote: Can someone tell me what we are going to do about firewalls that impose their own rules outside of the control of the DBA? Has anyone actually seen a firewall configured for something so stupid as to allow *almost* all the various packets involved in using a TCP

Re: [HACKERS] beta3 the open items list

2010-06-20 Thread Kenneth Marshall
On Sun, Jun 20, 2010 at 03:01:04PM -0500, Kevin Grittner wrote: Joshua D. Drake wrote: Can someone tell me what we are going to do about firewalls that impose their own rules outside of the control of the DBA? Has anyone actually seen a firewall configured for something so stupid as

Re: [HACKERS] beta3 the open items list

2010-06-20 Thread Robert Haas
On Sun, Jun 20, 2010 at 11:36 AM, Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Florian Pflug  wrote: Yeah, I agree. Just enabling keepalive should suffice for 9.0. +1, with configurable timeout; Right, of course.  That's already in the pending patch

Re: [HACKERS] beta3 the open items list

2010-06-20 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Sun, Jun 20, 2010 at 11:36 AM, Tom Lane t...@sss.pgh.pa.us wrote: Right, of course.  That's already in the pending patch isn't it? Is this sarcasm, or is there a pending patch I'm not aware of?

Re: [HACKERS] beta3 the open items list

2010-06-20 Thread Florian Pflug
On Jun 20, 2010, at 22:01 , Kevin Grittner wrote: Joshua D. Drake wrote: Can someone tell me what we are going to do about firewalls that impose their own rules outside of the control of the DBA? Has anyone actually seen a firewall configured for something so stupid as to allow *almost*

Re: [HACKERS] beta3 the open items list

2010-06-20 Thread Robert Haas
On Sun, Jun 20, 2010 at 5:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Sun, Jun 20, 2010 at 11:36 AM, Tom Lane t...@sss.pgh.pa.us wrote: Right, of course.  That's already in the pending patch isn't it? Is this sarcasm, or is there a pending patch I'm

Re: [HACKERS] beta3 the open items list

2010-06-20 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Sun, Jun 20, 2010 at 5:32 PM, Tom Lane t...@sss.pgh.pa.us wrote: https://commitfest.postgresql.org/action/patch_view?id=281 +1 for applying something along these lines, but we'll also need to update walreceiver to actually use one or more of these

Re: [HACKERS] beta3 the open items list

2010-06-20 Thread Greg Stark
On Sun, Jun 20, 2010 at 10:41 PM, Florian Pflug f...@phlo.org wrote: Yeah, especially since there is no such thing as a special keepalive packet in TCP. Keepalive simply sends packets with zero bytes of payload every once in a while if the connection is otherwise inactive. If those aren't

Re: [HACKERS] beta3 the open items list

2010-06-20 Thread Kevin Grittner
Greg Stark wrote: Keepalives introduce spurious disconnections in working TCP connections that have transient outages It's been a while since I read up on this, so perhaps my memory has distorted the facts over time, but I thought that under TCP, if one side sends a packet which isn't ack'd

Re: [HACKERS] beta3 the open items list

2010-06-20 Thread Florian Pflug
On Jun 21, 2010, at 0:13 , Greg Stark wrote: Keepalive is therefore extremely unlikely to break things - in the very worst case, a (really, really stupid) firewall might decide to drop packets with zero bytes of payload, causing inactive connections to abort after a while. AFAIK walreceiver

Re: [HACKERS] beta3 the open items list

2010-06-20 Thread Greg Stark
On Mon, Jun 21, 2010 at 12:42 AM, Florian Pflug f...@phlo.org wrote: I'd buy that if all timeouts and retry counts would default to +infinity. But they don't, and hence sufficiently long network outages *will* cause connection aborts anyway. That a particular connection might survive due to

Re: [HACKERS] beta3 the open items list

2010-06-20 Thread Robert Haas
On Sun, Jun 20, 2010 at 9:31 PM, Greg Stark gsst...@mit.edu wrote: On Mon, Jun 21, 2010 at 12:42 AM, Florian Pflug f...@phlo.org wrote: I'd buy that if all timeouts and retry counts would default to +infinity. But they don't, and hence sufficiently long network outages *will* cause

[HACKERS] beta3 the open items list

2010-06-19 Thread Robert Haas
It would be nice to get beta3 out the door sooner rather than later, but I sort of feel like we're not ready yet. In fact, we seem to be a bit stalled. The open items list currently lists four items. 1. max_standby_delay. Tom has committed to getting this done, but has been tied up with

Re: [HACKERS] beta3 the open items list

2010-06-19 Thread Joshua D. Drake
On Sat, 2010-06-19 at 09:43 -0400, Robert Haas wrote: 4. Streaming Replication needs to detect death of master. We need some sort of keep-alive, here. Whether it's at the TCP level (as advocated by Tom Lane and others) or at the protocol level (as advocated by Greg Stark) is something that

Re: [HACKERS] beta3 the open items list

2010-06-19 Thread Greg Stark
On Sat, Jun 19, 2010 at 2:43 PM, Robert Haas robertmh...@gmail.com wrote: 4. Streaming Replication needs to detect death of master.  We need some sort of keep-alive, here.  Whether it's at the TCP level (as advocated by Tom Lane and others) or at the protocol level (as advocated by Greg Stark)

Re: [HACKERS] beta3 the open items list

2010-06-19 Thread Robert Haas
On Sat, Jun 19, 2010 at 2:46 PM, Greg Stark gsst...@mit.edu wrote: On Sat, Jun 19, 2010 at 2:43 PM, Robert Haas robertmh...@gmail.com wrote: 4. Streaming Replication needs to detect death of master.  We need some sort of keep-alive, here.  Whether it's at the TCP level (as advocated by Tom

Re: [HACKERS] beta3 the open items list

2010-06-19 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: Right now, if the SR master reboots unexpectedly (say, power plug pull and restart), the slave never notices. It just sits there forever waiting for the next byte of data from the master to arrive (which it never will). This is nonsense --- the

Re: [HACKERS] beta3 the open items list

2010-06-19 Thread Andres Freund
On Saturday 19 June 2010 18:05:34 Joshua D. Drake wrote: On Sat, 2010-06-19 at 09:43 -0400, Robert Haas wrote: 4. Streaming Replication needs to detect death of master. We need some sort of keep-alive, here. Whether it's at the TCP level (as advocated by Tom Lane and others) or at the

Re: [HACKERS] beta3 the open items list

2010-06-19 Thread Stefan Kaltenbrunner
On 06/19/2010 09:13 PM, Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: Right now, if the SR master reboots unexpectedly (say, power plug pull and restart), the slave never notices. It just sits there forever waiting for the next byte of data from the master to arrive (which it never

Re: [HACKERS] beta3 the open items list

2010-06-19 Thread Florian Pflug
On Jun 19, 2010, at 21:13 , Tom Lane wrote: Robert Haas robertmh...@gmail.com writes: Right now, if the SR master reboots unexpectedly (say, power plug pull and restart), the slave never notices. It just sits there forever waiting for the next byte of data from the master to arrive (which it

Re: [HACKERS] beta3 the open items list

2010-06-19 Thread Simon Riggs
On Sat, 2010-06-19 at 14:53 -0400, Robert Haas wrote: On Sat, Jun 19, 2010 at 2:46 PM, Greg Stark gsst...@mit.edu wrote: On Sat, Jun 19, 2010 at 2:43 PM, Robert Haas robertmh...@gmail.com wrote: 4. Streaming Replication needs to detect death of master. We need some sort of keep-alive,

Re: [HACKERS] beta3 the open items list

2010-06-19 Thread Tom Lane
Florian Pflug f...@phlo.org writes: On Jun 19, 2010, at 21:13 , Tom Lane wrote: This is nonsense --- the slave's kernel *will* eventually notice that the TCP connection is dead, and tell walreceiver so. I don't doubt that the standard TCP timeout is longer than people want to wait for that,