Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Fujii Masao
On Wed, Oct 6, 2010 at 3:31 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: No. Synchronous replication does not help with availability. It allows you to achieve zero data loss, ie. if the master dies, you are guaranteed that any transaction that was acknowledged as committed,

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Heikki Linnakangas
On 06.10.2010 11:09, Fujii Masao wrote: On Wed, Oct 6, 2010 at 3:31 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: No. Synchronous replication does not help with availability. It allows you to achieve zero data loss, ie. if the master dies, you are guaranteed that any

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Markus Wanner
On 10/06/2010 10:17 AM, Heikki Linnakangas wrote: On 06.10.2010 11:09, Fujii Masao wrote: Hmm.. but we can increase availability without any data loss by using synchronous replication. Many people have already been using synchronous replication software such as DRBD for that purpose.

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Fujii Masao
On Wed, Oct 6, 2010 at 5:17 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 06.10.2010 11:09, Fujii Masao wrote: On Wed, Oct 6, 2010 at 3:31 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: No. Synchronous replication does not help with availability. It

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Heikki Linnakangas
On 06.10.2010 11:39, Markus Wanner wrote: On 10/06/2010 10:17 AM, Heikki Linnakangas wrote: On 06.10.2010 11:09, Fujii Masao wrote: Hmm.. but we can increase availability without any data loss by using synchronous replication. Many people have already been using synchronous replication

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Heikki Linnakangas
On 06.10.2010 11:49, Fujii Masao wrote: On Wed, Oct 6, 2010 at 5:17 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: Sure, but it's not the synchronous aspect that increases availability. It's the replication aspect, and we already have that. Making the replication synchronous

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Markus Wanner
On 10/06/2010 10:53 AM, Heikki Linnakangas wrote: Wow, that is really short. Are you sure? I have no first hand experience with DRBD, Neither do I. and reading that man page, I get the impression that the timeout is just for deciding that the TCP connection is dead. There is also the

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Magnus Hagander
On Wed, Oct 6, 2010 at 10:17, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: On 06.10.2010 11:09, Fujii Masao wrote: On Wed, Oct 6, 2010 at 3:31 PM, Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote: No. Synchronous replication does not help with availability. It

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Heikki Linnakangas
On 06.10.2010 13:41, Magnus Hagander wrote: That's only for a narrow definition of availability. For a lot of people, having access to your data isn't considered availability unless you can trust the data... Ok, fair enough. For that, synchronous replication in the wait forever mode is the

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Dimitri Fontaine
Markus Wanner mar...@bluegap.ch writes: On 10/06/2010 04:31 AM, Simon Riggs wrote: That situation would require two things * First, you have set up async replication and you're not monitoring it properly. Shame on you. The way I read it, Jeff is complaining about the timeout you propose

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Heikki Linnakangas
On 06.10.2010 15:22, Dimitri Fontaine wrote: What is necessary here is a clear view on the possible states that a standby can be in at any time, and we must stop trying to apply to some non-ready standby the behavior we want when it's already in-sync. From my experience operating londiste,

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Simon Riggs
On Wed, 2010-10-06 at 15:26 +0300, Heikki Linnakangas wrote: You're not going to get zero data loss that way. Ending the wait state does not cause data loss. It puts you at *risk* of data loss, which is a different thing entirely. If you want to avoid data loss you use N+k redundancy and get

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Dimitri Fontaine
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: 1. base-backup — self explaining 2. catch-up — getting the WAL to catch up after base backup 3. wanna-sync — don't yet have all the WAL to get in sync 4. do-sync — all WALs are there, coming soon 5. ok (async

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Heikki Linnakangas
On 06.10.2010 17:20, Simon Riggs wrote: On Wed, 2010-10-06 at 15:26 +0300, Heikki Linnakangas wrote: You're not going to get zero data loss that way. Ending the wait state does not cause data loss. It puts you at *risk* of data loss, which is a different thing entirely. Looking at it that

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Heikki Linnakangas
On 06.10.2010 18:02, Dimitri Fontaine wrote: Heikki Linnakangasheikki.linnakan...@enterprisedb.com writes: 1. base-backup — self explaining 2. catch-up — getting the WAL to catch up after base backup 3. wanna-sync — don't yet have all the WAL to get in sync 4. do-sync —

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Markus Wanner
On 10/06/2010 04:20 PM, Simon Riggs wrote: Ending the wait state does not cause data loss. It puts you at *risk* of data loss, which is a different thing entirely. These kinds of risk scenarios are what sync replication is all about. A minimum guarantee that doesn't hold in the face of the first few

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Dimitri Fontaine
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes: I'm sorry, but I still don't understand the use case you're envisioning. How many standbys are there? What are you trying to achieve with synchronous replication over what asynchronous offers? Sorry if I've been unclear, I read

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Josh Berkus
All, Let me clarify and consolidate this discussion. Again, it's my goal that this thread specifically identify only the problems and desired behaviors for synch rep with more than one sync standby. There are several issues with even one sync standby which still remain unresolved, but I believe

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Markus Wanner
Hello Dimitri, On 10/06/2010 05:41 PM, Dimitri Fontaine wrote: - when do you start considering the standby as a candidate to your sync rep requirements? That question doesn't make much sense to me. There's no point in time I ever mind if a standby is a candidate or not. Either I want to

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Dimitri Fontaine
Markus Wanner mar...@bluegap.ch writes: There's no point in time I ever mind if a standby is a candidate or not. Either I want to synchronously replicate to X standbies, or not. Ok so I think we're agreeing here: what I said amounts to propose that the code does work this way when the quorum

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Heikki Linnakangas
On 06.10.2010 20:57, Josh Berkus wrote: While it's nice to dismiss case (1) as an edge-case, consider the likelihood of someone running PostgreSQL with fsync=off on cloud hosting. In that case, having k = N = 5 does not seem like an unreasonable arrangement if you want to ensure durability via

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Markus Wanner
On 10/06/2010 09:04 PM, Dimitri Fontaine wrote: Ok so I think we're agreeing here: what I said amounts to propose that the code does work this way when the quorum is such setup, and/or is able to reject any non-read-only transaction (those that need a real XID) until your standby is fully in

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Josh Berkus
Seems reasonable, but what is a CAP database? Database based around the CAP theorem[1]. Cassandra, Dynamo, Hypertable, etc. For us, the equation is: CAD, as in Consistency, Availability, Durability. Pick any two, at best. But it's a very similar bag of issues as the ones CAP addresses.

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Simon Riggs
On Wed, 2010-10-06 at 18:04 +0300, Heikki Linnakangas wrote: The key is whether you are guaranteed to have zero data loss or not. We agree that is an important question. You seem willing to trade anything for that guarantee. I seek a more pragmatic approach that balances availability and risk.

Re: [HACKERS] Issues with Quorum Commit

2010-10-05 Thread Heikki Linnakangas
On 05.10.2010 22:11, Josh Berkus wrote: There's been a lot of discussion on synch rep lately which involves quorum commit. I need to raise some major design issues with quorum commit which I don't think that people have really considered, and may be sufficient to prevent it from being included

Re: [HACKERS] Issues with Quorum Commit

2010-10-05 Thread Josh Berkus
Heikki, The master can not roll back or cancel the transaction. That's completely infeasible, the WAL record has been written to local disk already. The best it can do is halt and wait for enough standbys to appear to fulfill the quorum. The client will hang waiting for the COMMIT to finish,

Re: [HACKERS] Issues with Quorum Commit

2010-10-05 Thread Jeff Davis
On Tue, 2010-10-05 at 12:11 -0700, Josh Berkus wrote: B. Eventual Inconsistency - If we have a quorum commit, it's possible for any individual standby to be indefinitely ahead of any standby which is not needed by the quorum. This means that: -- There is no clear

Re: [HACKERS] Issues with Quorum Commit

2010-10-05 Thread Simon Riggs
On Tue, 2010-10-05 at 22:32 +0300, Heikki Linnakangas wrote: On 05.10.2010 22:11, Josh Berkus wrote: There's been a lot of discussion on synch rep lately which involves quorum commit. I need to raise some major design issues with quorum commit which I don't think that people have really

Re: [HACKERS] Issues with Quorum Commit

2010-10-05 Thread Simon Riggs
On Tue, 2010-10-05 at 13:45 -0700, Jeff Davis wrote: On Tue, 2010-10-05 at 12:11 -0700, Josh Berkus wrote: B. Eventual Inconsistency - If we have a quorum commit, it's possible for any individual standby to be indefinitely ahead of any standby which is not needed

Re: [HACKERS] Issues with Quorum Commit

2010-10-05 Thread Robert Haas
On Tue, Oct 5, 2010 at 5:10 PM, Simon Riggs si...@2ndquadrant.com wrote: The points appear to be directed at quorum commit, which is a name I've used. But most of the points apply more to Fujii's patch than my own. I can only presume that Josh wants to prevent us from adopting a design that

Re: [HACKERS] Issues with Quorum Commit

2010-10-05 Thread Simon Riggs
On Tue, 2010-10-05 at 13:43 -0700, Josh Berkus wrote: Again, I'm just saying that merely doing single-server synch rep, *and* making HS/SR easier to admin in general, is going to be a big task for 9.1. Quorum Commit needs to be considered a separate feature, and one which is dispensable for

Re: [HACKERS] Issues with Quorum Commit

2010-10-05 Thread Simon Riggs
On Tue, 2010-10-05 at 17:21 -0400, Robert Haas wrote: On Tue, Oct 5, 2010 at 5:10 PM, Simon Riggs si...@2ndquadrant.com wrote: The points appear to be directed at quorum commit, which is a name I've used. But most of the points apply more to Fujii's patch than my own. I can only presume

Re: [HACKERS] Issues with Quorum Commit

2010-10-05 Thread Josh Berkus
Simon, Robert, The points appear to be directed at quorum commit, which is a name I've used. But most of the points apply more to Fujii's patch than my own. Per previous discussion, I'm trying to get at what reasonable behavior is, rather than targeting one patch or the other. I can only

Re: [HACKERS] Issues with Quorum Commit

2010-10-05 Thread Simon Riggs
On Tue, 2010-10-05 at 15:14 -0700, Josh Berkus wrote: I can only presume that Josh wants to prevent us from adopting a design that allows sync against multiple standbys. Quorum commit == X servers need to ack for commit, where X > 1. Usually done as X out of Y servers must ack, but it's

Re: [HACKERS] Issues with Quorum Commit

2010-10-05 Thread Josh Berkus
Heikki had argued that a use case existed where Y out of Y (i.e. all) nodes must acknowledge before we commit. That was the use case that required us to have standby registration. It was optional in all other cases. Yeah, Y of Y is just a special case of X of Y. And, IMHO, rather pointless

Re: [HACKERS] Issues with Quorum Commit

2010-10-05 Thread Jeff Davis
On Tue, 2010-10-05 at 22:19 +0100, Simon Riggs wrote: In other words, a lagging standby combined with a timeout mechanism is essentially useless, because it will never catch up in time to be a part of the quorum. Thanks for explaining what was meant. This issue is a serious problem

Re: [HACKERS] Issues with Quorum Commit

2010-10-05 Thread Simon Riggs
On Tue, 2010-10-05 at 18:52 -0700, Jeff Davis wrote: I'm not saying that an unavailable system is good, but I don't see how my particular complaint applies to the wait for all servers to apply case. The case I was worried about is: * 1 master and 2 standby * The rule is wait for at least
