Re: [HACKERS] Issues with Quorum Commit

2010-10-20 Thread Bruce Momjian
Tom Lane wrote: > Greg Smith writes: > > I don't see this as needing any implementation any more complicated than > > the usual way such timeouts are handled. Note how long you've been > > trying to reach the standby. Default to -1 for forever. And if you hit > > the timeout, mark the standb

Re: [HACKERS] Issues with Quorum Commit

2010-10-14 Thread Robert Haas
On Wed, Oct 13, 2010 at 10:18 PM, Greg Stark wrote: > On Tue, Oct 12, 2010 at 11:50 PM, Robert Haas wrote: >> There's another problem here we should think about, too.  Suppose you >> have a master and two standbys.  The master dies.  You promote one of >> the standbys, which turns out to be behin

Re: [HACKERS] Issues with Quorum Commit

2010-10-14 Thread Fujii Masao
On Thu, Oct 14, 2010 at 11:18 AM, Greg Stark wrote: > Why don't the usual protections kick in here? The new record read from > the location the xlog reader is expecting to find it has to have a > valid CRC and a correct back pointer to the previous record. Yep. In most cases, those protections se

Re: [HACKERS] Issues with Quorum Commit

2010-10-14 Thread Robert Haas
On Wed, Oct 13, 2010 at 5:22 AM, Fujii Masao wrote: > On Wed, Oct 13, 2010 at 3:50 PM, Robert Haas wrote: >> There's another problem here we should think about, too.  Suppose you >> have a master and two standbys.  The master dies.  You promote one of >> the standbys, which turns out to be behind

Re: [HACKERS] Issues with Quorum Commit

2010-10-14 Thread Greg Stark
On Tue, Oct 12, 2010 at 11:50 PM, Robert Haas wrote: > There's another problem here we should think about, too.  Suppose you > have a master and two standbys.  The master dies.  You promote one of > the standbys, which turns out to be behind the other.  You then > repoint the other standby at the

Re: [HACKERS] Issues with Quorum Commit

2010-10-13 Thread Fujii Masao
On Wed, Oct 13, 2010 at 3:50 PM, Robert Haas wrote: > There's another problem here we should think about, too.  Suppose you > have a master and two standbys.  The master dies.  You promote one of > the standbys, which turns out to be behind the other.  You then > repoint the other standby at the o

Re: [HACKERS] Issues with Quorum Commit

2010-10-13 Thread Fujii Masao
On Wed, Oct 13, 2010 at 3:43 PM, Heikki Linnakangas wrote: > On 13.10.2010 08:21, Fujii Masao wrote: >> >> On Sat, Oct 9, 2010 at 4:31 AM, Heikki Linnakangas >>  wrote: >>> >>> It shouldn't be too hard to fix. Walsender needs to be able to read WAL >>> from >>> preceding timelines, like recovery

Re: [HACKERS] Issues with Quorum Commit

2010-10-13 Thread Markus Wanner
On 10/13/2010 06:43 AM, Fujii Masao wrote: > Unfortunately even enough standbys don't increase write-availability > unless you choose wait-forever. Because, after promoting one of > standbys to new master, you must keep all the transactions waiting > until at least one standby has connected to and

Re: [HACKERS] Issues with Quorum Commit

2010-10-12 Thread Robert Haas
On Wed, Oct 13, 2010 at 2:43 AM, Heikki Linnakangas wrote: > On 13.10.2010 08:21, Fujii Masao wrote: >> >> On Sat, Oct 9, 2010 at 4:31 AM, Heikki Linnakangas >>  wrote: >>> >>> It shouldn't be too hard to fix. Walsender needs to be able to read WAL >>> from >>> preceding timelines, like recovery

Re: [HACKERS] Issues with Quorum Commit

2010-10-12 Thread Heikki Linnakangas
On 13.10.2010 08:21, Fujii Masao wrote: On Sat, Oct 9, 2010 at 4:31 AM, Heikki Linnakangas wrote: It shouldn't be too hard to fix. Walsender needs to be able to read WAL from preceding timelines, like recovery does, and walreceiver needs to write the incoming WAL to the right file. And walse

Re: [HACKERS] Issues with Quorum Commit

2010-10-12 Thread Fujii Masao
On Sat, Oct 9, 2010 at 4:31 AM, Heikki Linnakangas wrote: >> Yes. But if there is no unsent WAL when the master goes down, >> we can start new standby without new backup by copying the >> timeline history file from new master to new standby and >> setting recovery_target_timeline to 'latest'. > >

Re: [HACKERS] Issues with Quorum Commit

2010-10-12 Thread Fujii Masao
On Sat, Oct 9, 2010 at 1:41 AM, Josh Berkus wrote: > >> And, I'd like to know whether the master waits forever because of the >> standby failure in other solutions such as Oracle DataGuard, MySQL >> semi-synchronous replication. > > MySQL used to be fond of simply failing silently.  Not sure what

Re: [HACKERS] Issues with Quorum Commit

2010-10-12 Thread Fujii Masao
On Sat, Oct 9, 2010 at 12:12 AM, Markus Wanner wrote: > On 10/08/2010 04:48 PM, Fujii Masao wrote: >> I believe many systems require write-availability. > > Sure. Make sure you have enough standbys to fail over to. Unfortunately even enough standbys don't increase write-availability unless you c

Re: [HACKERS] Issues with Quorum Commit

2010-10-11 Thread Markus Wanner
Greg, to me it looks like we have very similar goals, but start from different preconditions. I absolutely agree with you given the preconditions you named. On 10/08/2010 10:04 PM, Greg Smith wrote: > How is that a new problem? It's already possible to end up with a > standby pair that has suffe

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Simon Riggs
On Fri, 2010-10-08 at 16:34 -0400, Greg Smith wrote: > Tom Lane wrote: > > How are you going to "mark the standby as degraded"? The > > standby can't keep that information, because it's not even connected > > when the master makes the decision. > > From a high level, I'm assuming only that the m

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Simon Riggs
On Fri, 2010-10-08 at 17:06 +0200, Markus Wanner wrote: > Well, full cluster outages are infrequent, but sadly cannot be avoided > entirely. (Murphy's laughing). IMO we should be prepared to deal with > those. I've described how I propose to deal with those. I'm not waving away these issues, just

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Greg Smith
Tom Lane wrote: How are you going to "mark the standby as degraded"? The standby can't keep that information, because it's not even connected when the master makes the decision. From a high level, I'm assuming only that the master has a list in memory of the standby system(s) it believes are

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Greg Smith
Markus Wanner wrote: ..and how do you make sure you are not marking your second standby as degraded just because it's currently lagging? Effectively degrading the utterly needed one, because your first standby has just bitten the dust? People are going to monitor the standby lag. If it gets

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Heikki Linnakangas
On 08.10.2010 17:26, Fujii Masao wrote: On Fri, Oct 8, 2010 at 5:10 PM, Heikki Linnakangas wrote: Do we really need that? Yes. But if there is no unsent WAL when the master goes down, we can start new standby without new backup by copying the timeline history file from new master to new stan

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Rob Wultsch
On 10/8/10, Fujii Masao wrote: > On Fri, Oct 8, 2010 at 5:10 PM, Heikki Linnakangas > wrote: >> Do we really need that? > > Yes. But if there is no unsent WAL when the master goes down, > we can start new standby without new backup by copying the > timeline history file from new master to new

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Josh Berkus
And, I'd like to know whether the master waits forever because of the standby failure in other solutions such as Oracle DataGuard, MySQL semi-synchronous replication. MySQL used to be fond of simply failing silently. Not sure what 5.4 does, or Oracle. In any case MySQL's replication has al

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Markus Wanner
On 10/08/2010 04:48 PM, Fujii Masao wrote: > I believe many systems require write-availability. Sure. Make sure you have enough standbys to fail over to. (I think there are even more situations where read-availability is much more important, though). >> Start with 0 (i.e. replication off), then

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Markus Wanner
On 10/08/2010 04:47 PM, Simon Riggs wrote: > Yes, I really want to avoid such issues and likely complexities we get > into trying to solve them. In reality they should not be common because > it only happens if the sysadmin has not configured sufficient number of > redundant standbys. Well, full c

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Simon Riggs
On Fri, 2010-10-08 at 23:55 +0900, Fujii Masao wrote: > On Fri, Oct 8, 2010 at 6:00 PM, Simon Riggs wrote: > > From the perspective of an observer, randomly selecting a standby for > > load balancing purposes: No, they are not guaranteed to see the "latest" > > answer, nor even can they find out

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Fujii Masao
On Fri, Oct 8, 2010 at 6:00 PM, Simon Riggs wrote: > From the perspective of an observer, randomly selecting a standby for > load balancing purposes: No, they are not guaranteed to see the "latest" > answer, nor even can they find out whether what they are seeing is the > latest answer. To guara

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Fujii Masao
On Fri, Oct 8, 2010 at 5:16 PM, Markus Wanner wrote: > On 10/08/2010 05:41 AM, Fujii Masao wrote: >> But, even with quorum commit, if you choose wait-forever option, >> failover would decrease availability. Right after the failover, >> no standby has connected to new master, so if quorum >= 1, all

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Simon Riggs
On Fri, 2010-10-08 at 10:11 -0400, Tom Lane wrote: > 1. a unique identifier for each standby (not just role names that > multiple standbys might share); That is difficult because each standby is identical. If a standby goes down, people can regenerate a new standby by taking a copy from another s

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Markus Wanner
On 10/08/2010 04:38 PM, Tom Lane wrote: > Markus Wanner writes: >> IIUC you seem to assume that the master node keeps its master role. But >> users who value availability a lot certainly want automatic fail-over, > > Huh? Surely loss of the slaves shouldn't force a failover. Maybe the > slaves

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Tom Lane
Markus Wanner writes: > On 10/08/2010 04:11 PM, Tom Lane wrote: >> Actually, #2 seems rather difficult even if you want it. Presumably >> you'd like to keep that state in reliable storage, so it survives master >> crashes. But how you gonna commit a change to that state, if you just >> lost ever

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Markus Wanner
On 10/08/2010 12:05 PM, Dimitri Fontaine wrote: > Markus Wanner writes: >> ..and a whole lot of manual work, that's prone to error for something >> that could easily be automated > > So, the master just crashed, first standby is dead and second ain't in > sync. What's the easy and automated way o

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Dimitri Fontaine
Tom Lane writes: > Well, actually, that's *considerably* more complicated than just a > timeout. How are you going to "mark the standby as degraded"? The > standby can't keep that information, because it's not even connected > when the master makes the decision. ISTM that this requires > > 1. a

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Markus Wanner
On 10/08/2010 04:11 PM, Tom Lane wrote: > Actually, #2 seems rather difficult even if you want it. Presumably > you'd like to keep that state in reliable storage, so it survives master > crashes. But how you gonna commit a change to that state, if you just > lost every standby (suppose master's e

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Fujii Masao
On Fri, Oct 8, 2010 at 5:10 PM, Heikki Linnakangas wrote: > Do we really need that? Yes. But if there is no unsent WAL when the master goes down, we can start new standby without new backup by copying the timeline history file from new master to new standby and setting recovery_target_timeline to

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Tom Lane
Greg Smith writes: > I don't see this as needing any implementation any more complicated than > the usual way such timeouts are handled. Note how long you've been > trying to reach the standby. Default to -1 for forever. And if you hit > the timeout, mark the standby as degraded and force th

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Fujii Masao
On Fri, Oct 8, 2010 at 5:07 PM, Markus Wanner wrote: > On 10/08/2010 04:01 AM, Fujii Masao wrote: >> Really? I don't think that ko-count=0 means "wait-forever". > > Telling from the documentation, I'd also say it doesn't wait forever by > default. However, please note that there are different para

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Dimitri Fontaine
Markus Wanner writes: > ..and a whole lot of manual work, that's prone to error for something > that could easily be automated So, the master just crashed, first standby is dead and second ain't in sync. What's the easy and automated way out? Sorry, I need a hand here. -- Dimitri Fontaine http:

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Markus Wanner
On 10/08/2010 11:41 AM, Dimitri Fontaine wrote: > Same old story. Either you're able to try and fix the master so that you > don't lose any data and don't even have to check for that, or you take a > risk and start from a non synced standby. It's all availability against > durability again. ..and

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Dimitri Fontaine
Markus Wanner writes: > ..and how do you make sure you are not marking your second standby as > degraded just because it's currently lagging? Well, in sync rep, a standby that's not able to stay under the timeout is degraded. Full stop. The presence of the timeout (or its value not being -1) mea

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Markus Wanner
On 10/08/2010 11:00 AM, Simon Riggs wrote: > From the perspective of an observer, randomly selecting a standby for > load balancing purposes: No, they are not guaranteed to see the "latest" > answer, nor even can they find out whether what they are seeing is the > latest answer. I completely agree

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Markus Wanner
On 10/08/2010 01:44 AM, Greg Smith wrote: > They'll use Sync Rep to maximize > the odds a system failure doesn't cause any transaction loss. They'll > use good quality hardware on the master so it's unlikely to fail. .."unlikely to fail"? Ehm.. is that you speaking, Greg? ;-) > But > when the d

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Simon Riggs
On Fri, 2010-10-08 at 11:27 +0300, Heikki Linnakangas wrote: > On 08.10.2010 11:25, Simon Riggs wrote: > > On Fri, 2010-10-08 at 10:56 +0300, Heikki Linnakangas wrote: > >>> > >>> Or what kind of customers do you think really need a no-lag solution for > >>> read-only queries? In the LAN case, the

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Markus Wanner
On 10/08/2010 09:56 AM, Heikki Linnakangas wrote: > Imagine a web application that's mostly read-only, but a > user can modify his own personal details like name and address, for > example. Imagine that the user changes his street address and clicks > 'save', causing an UPDATE, and the next query f

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Markus Wanner
On 10/08/2010 10:27 AM, Heikki Linnakangas wrote: > Synchronous replication in the 'replay' mode is supposed to guarantee > exactly that, no? The master may lag behind, so it's not strictly speaking the same data. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@pos

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Heikki Linnakangas
On 08.10.2010 11:25, Simon Riggs wrote: On Fri, 2010-10-08 at 10:56 +0300, Heikki Linnakangas wrote: Or what kind of customers do you think really need a no-lag solution for read-only queries? In the LAN case, the lag of async rep is negligible and in the WAN case the latencies of sync rep are

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Simon Riggs
On Fri, 2010-10-08 at 10:56 +0300, Heikki Linnakangas wrote: > > > > Or what kind of customers do you think really need a no-lag solution for > > read-only queries? In the LAN case, the lag of async rep is negligible > > and in the WAN case the latencies of sync rep are prohibitive. > > There is a

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Heikki Linnakangas
On 08.10.2010 01:25, Simon Riggs wrote: On Thu, 2010-10-07 at 13:44 -0400, Aidan Van Dyk wrote: To get "non-stale" responses, you can only query those k=3 servers. But you've shot yourself in the foot because you don't know which 3/10 those will be. The other 7 *are* stale (by definition). T

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Markus Wanner
On 10/08/2010 05:41 AM, Fujii Masao wrote: > But, even with quorum commit, if you choose wait-forever option, > failover would decrease availability. Right after the failover, > no standby has connected to new master, so if quorum >= 1, all > the transactions must wait for a while. That's a point,

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Simon Riggs
On Fri, 2010-10-08 at 09:52 +0200, Markus Wanner wrote: > One addendum: a timeout increases availability at the cost of > increased danger of data loss and higher complexity. Don't use it, > just increase (N - k) instead. Completely agree. -- Simon Riggs www.2ndQuadrant.com Postgre

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Heikki Linnakangas
On 08.10.2010 06:41, Fujii Masao wrote: On Thu, Oct 7, 2010 at 3:01 AM, Markus Wanner wrote: Of course, it doesn't make sense to wait-forever on *every* standby that ever gets added. Quorum commit is required, yes (and that's what this thread is about, IIRC). But with quorum commit, adding a st

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Markus Wanner
On 10/08/2010 04:01 AM, Fujii Masao wrote: > Really? I don't think that ko-count=0 means "wait-forever". Telling from the documentation, I'd also say it doesn't wait forever by default. However, please note that there are different parameters for the initial wait for connection during boot up (wfc

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Heikki Linnakangas
On 07.10.2010 21:38, Markus Wanner wrote: On 10/07/2010 03:19 PM, Dimitri Fontaine wrote: I think you're all into durability, and that's good. The extra cost is service downtime It's just *reduced* availability. That doesn't necessarily mean downtime, if you combine cleverly with async replica

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Markus Wanner
Simon, On 10/08/2010 12:25 AM, Simon Riggs wrote: > Asking for k > 1 does *not* mean those servers are time synchronised. Yes, it's technically impossible to create a fully synchronized cluster (on the basis of shared-nothing nodes we are aiming for, that is). There always is some kind of "lag" o

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Markus Wanner
On 10/08/2010 12:30 AM, Simon Riggs wrote: > I do, but it's not a parameter. The k = 1 behaviour is hardcoded and > considerably simplifies the design. Moving to k > 1 is additional work, > slows things down and seems likely to be fragile. Perfect! So I'm all in favor of committing that, but leavin

Re: [HACKERS] Issues with Quorum Commit

2010-10-08 Thread Dimitri Fontaine
Greg Smith writes: […] > I don't see this as needing any implementation any more complicated than the > usual way such timeouts are handled. Note how long you've been trying to > reach the standby. Default to -1 for forever. And if you hit the timeout, > mark the standby as degraded and force t

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Fujii Masao
On Fri, Oct 8, 2010 at 8:44 AM, Greg Smith wrote: > Additional code?  Yes.  Foot-gun?  Yes.  Timeout should be disabled by > default so that you get wait forever unless you ask for something different? >  Probably.  Unneeded?  This is where we don't agree anymore.  The example > that Josh Berkus j

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Joshua D. Drake
On Thu, 2010-10-07 at 19:44 -0400, Greg Smith wrote: > I don't see this as needing any implementation any more complicated than > the usual way such timeouts are handled. Note how long you've been > trying to reach the standby. Default to -1 for forever. And if you hit > the timeout, mark th

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Fujii Masao
On Thu, Oct 7, 2010 at 3:01 AM, Markus Wanner wrote: > Of course, it doesn't make sense to wait-forever on *every* standby that > ever gets added. Quorum commit is required, yes (and that's what this > thread is about, IIRC). But with quorum commit, adding a standby only > improves availability, b

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Fujii Masao
On Thu, Oct 7, 2010 at 5:01 AM, Simon Riggs wrote: > You seem willing to trade anything for that guarantee. I seek a more > pragmatic approach that balances availability and risk. > > Those views are different, but not inconsistent. Oracle manages to offer > multiple options and so can we. +1 Re

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Fujii Masao
On Wed, Oct 6, 2010 at 9:22 PM, Dimitri Fontaine wrote: > From my experience operating londiste, those states would be: > >  1. base-backup  — self explaining >  2. catch-up     — getting the WAL to catch up after base backup >  3. wanna-sync   — don't yet have all the WAL to get in sync >  4. do-

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Robert Haas
On Thu, Oct 7, 2010 at 10:24 PM, Fujii Masao wrote: > On Wed, Oct 6, 2010 at 6:00 PM, Heikki Linnakangas > wrote: >> In general, salvaging the WAL that was not sent to the standby yet is >> outright impossible. You can't achieve zero data loss with asynchronous >> replication at all. > > No. That

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Fujii Masao
On Wed, Oct 6, 2010 at 6:00 PM, Heikki Linnakangas wrote: > In general, salvaging the WAL that was not sent to the standby yet is > outright impossible. You can't achieve zero data loss with asynchronous > replication at all. No. That depends on the type of failure. Unless the disk in the master

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Fujii Masao
On Wed, Oct 6, 2010 at 6:11 PM, Markus Wanner wrote: > Yeah, sounds more likely. Then I'm surprised that I didn't find any > warning that the Protocol C definitely reduces availability (with the > ko-count=0 default, that is). Really? I don't think that ko-count=0 means "wait-forever". IIRC, when

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Greg Smith
Markus Wanner wrote: So far I've been under the impression that Simon already has the code for quorum_commit k = 1. What I'm opposed to is the timeout "feature", which I consider to be additional code, unneeded complexity and a foot-gun. Additional code? Yes. Foot-gun? Yes. Timeout shoul

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Josh Berkus
All, > Establishing an affinity between a session and one of the database > servers will only help if the traffic is strictly read-only. I think this thread has drifted very far away from anything we're going to do for 9.1. And seems to have little to do with synchronous replication. Synch rep

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Simon Riggs
On Thu, 2010-10-07 at 19:50 +0200, Markus Wanner wrote: > So far I've been under the impression that Simon already has the code > for quorum_commit k = 1. I do, but it's not a parameter. The k = 1 behaviour is hardcoded and considerably simplifies the design. Moving to k > 1 is additional work, sl

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Simon Riggs
On Thu, 2010-10-07 at 13:44 -0400, Aidan Van Dyk wrote: > To get "non-stale" responses, you can only query those k=3 servers. > But you've shot yourself in the foot because you don't know which > 3/10 those will be. The other 7 *are* stale (by definition). They > talk about picking the "caught

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Kevin Grittner
Robert Haas wrote: > Establishing an affinity between a session and one of the database > servers will only help if the traffic is strictly read-only. Thanks; I now see your point. In our environment, that's pretty common. Our most heavily used web app (the one for which we have, at times,

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Dimitri Fontaine
Markus Wanner writes: > I don't buy that. The risk calculation gets a lot simpler and obvious > with strict guarantees. Ok, I'm lost in the use cases and analysis. I still don't understand why you want to consider the system already synchronous when it's not, whatever is the guarantee you're as

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Robert Haas
On Thu, Oct 7, 2010 at 2:31 PM, Kevin Grittner wrote: > Robert Haas wrote: >> Kevin Grittner wrote: > >>> With web applications, at least, you often don't care that the >>> data read is absolutely up-to-date, as long as the point in time >>> doesn't jump around from one request to the next.  Whe

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Markus Wanner
On 10/07/2010 07:44 PM, Aidan Van Dyk wrote: > The only case I see a "race to quorum" type of k < N being useful is > if you're just trying to duplicate data everywhere, but not actually > querying any of the replicas. I can see that "all queries go to the > master, but the chances are pretty high

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Markus Wanner
On 10/07/2010 03:19 PM, Dimitri Fontaine wrote: > I think you're all into durability, and that's good. The extra cost is > service downtime It's just *reduced* availability. That doesn't necessarily mean downtime, if you combine cleverly with async replication. > if that's not what you're after:

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Kevin Grittner
Robert Haas wrote: > Kevin Grittner wrote: >> With web applications, at least, you often don't care that the >> data read is absolutely up-to-date, as long as the point in time >> doesn't jump around from one request to the next. When we have >> used load balancing between multiple database se

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Robert Haas
On Thu, Oct 7, 2010 at 2:10 PM, Kevin Grittner wrote: > Aidan Van Dyk wrote: > >> To get "non-stale" responses, you can only query those k=3 >> servers.  But you've shot yourself in the foot because you don't >> know which 3/10 those will be.  The other 7 *are* stale (by >> definition). They tal

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Kevin Grittner
Aidan Van Dyk wrote: > To get "non-stale" responses, you can only query those k=3 > servers. But you've shot yourself in the foot because you don't > know which 3/10 those will be. The other 7 *are* stale (by > definition). They talk about picking the "caught up" slave when > the master fails

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Josh Berkus
> But as a practical matter, I'm afraid the true cost of the better > guarantee you're suggesting here is additional code complexity that will > likely cause this feature to miss 9.1 altogether. As far as I'm > concerned, this whole diversion into the topic of quorum commit is only > consuming re

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Markus Wanner
On 10/07/2010 06:41 PM, Greg Smith wrote: > The cost of hardware capable of running a database server is a large > multiple of what you can build an alerting machine for. You realize you don't need lots of disks nor RAM for a box that only ACKs? A box with two SAS disks and a BBU isn't that expens

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Josh Berkus
> If you want "synchronous replication" because you want "query > availability" while making sure you're not getting "stale" queries from > all your slaves, then using your k < N (k = 3 and N = 10) situation is > screwing yourself. Correct. If that is your reason for synch standby, then you shoul

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Aidan Van Dyk
On Thu, Oct 7, 2010 at 1:22 PM, Josh Berkus wrote: > So if you have k = 3 and N = 10, then you can have 10 standbys and only > 3 of them need to ack any specific commit for the master to proceed. As > long as (a) you retain at least one of the 3 which ack'd, and (b) you > have some way of determi

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Josh Berkus
On 10/7/10 6:41 AM, Aidan Van Dyk wrote: > I'm really confused with all this k < N scenarios I see bandied > about, because, all it really amounts to is "I only want *one* > synchronous replication, and a bunch of asynchronous replications". > And a bit of chance thrown in the mix to hope the "sync

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Greg Smith
Markus Wanner wrote: I think that's a pretty special case, because the "good alerting system" is at least as expensive as another server that just persistently stores and ACKs incoming WAL. The cost of hardware capable of running a database server is a large multiple of what you can build a

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Dimitri Fontaine
Aidan Van Dyk writes: > *shrug* The joining standby is still asynchronous at this point. > It's not synchronous replication. It's just another ^k of the N > slaves serving stale data ;-) Agreed *here*, but if you read the threads again, you'll see that's not at all what's been talked about befo

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Aidan Van Dyk
On Thu, Oct 7, 2010 at 10:08 AM, Dimitri Fontaine wrote: > Aidan Van Dyk writes: >> Sure, but that lagged standby is already asynchronous, not >> synchronous.  If it was synchronous, it would have slowed the master >> down enough it would not be lagged. > > Agreed, except in the case of a joinin

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Dimitri Fontaine
Aidan Van Dyk writes: > Sure, but that lagged standby is already asynchronous, not > synchronous. If it was synchronous, it would have slowed the master > down enough it would not be lagged. Agreed, except in the case of a joining standby. But you're saying it better than I do: > Yes, I believ

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Aidan Van Dyk
On Thu, Oct 7, 2010 at 6:32 AM, Dimitri Fontaine wrote: > Or if the standby is lagging and the master wal_keep_segments is not > sized big enough. Is that a catastrophic loss of the standby too? Sure, but that lagged standby is already asynchronous, not synchronous. If it was synchronous, it w

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Dimitri Fontaine
Markus Wanner writes: > Why does one ever want the guarantee that sync replication gives to only > hold true up to one failure, if a better guarantee doesn't cost anything > extra? (Note that a "good alerting system" is impossible to achieve with > only two servers. You need a third device anyway)

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Markus Wanner
Salut Dimitri, On 10/07/2010 12:32 PM, Dimitri Fontaine wrote: > Another one is to say that I want sync rep when the standby is > available, but I don't have the budget for more. So I prefer a good > alerting system and low-budget-no-guarantee when the standby is down, > that's my risk evaluation.

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Robert Haas
On Thu, Oct 7, 2010 at 3:30 AM, Simon Riggs wrote: > Yes, let's get k = 1 first. > > With k = 1 the number of standbys is not limited, so we can still have > very robust and highly available architectures. So we mean > "first-acknowledgement-releases-waiters". +1. I like the design Greg Smith pro

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Markus Wanner
On 10/07/2010 01:08 PM, Simon Riggs wrote: > Adding timeout is very little code. We can take that out of the patch if > that's an objection. Okay. If you take it out, we are at the wait-forever option, right? If not, I definitely don't understand how you envision things to happen. I've been askin

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Simon Riggs
On Thu, 2010-10-07 at 11:46 +0200, Markus Wanner wrote: > On 10/06/2010 10:01 PM, Simon Riggs wrote: > > The code to implement your desired option is > > more complex and really should come later. > > I'm sorry, but I think of that exactly the opposite way. I see why you say that. Dimitri's sugg

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Dimitri Fontaine
Heikki Linnakangas writes: > Either that, or you configure your system for asynchronous replication > first, and flip the switch to synchronous only after the standby has caught > up. Setting up the first standby happens only once when you initially set up > the system, or if you're recovering fro

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Heikki Linnakangas
On 07.10.2010 12:52, Dimitri Fontaine wrote: Markus Wanner writes: I'm just saying that this should be an option, not the only choice. I'm sorry, I just don't see the use case for a mode that drops guarantees when they are most needed. People who don't need those guarantees should definitely

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Dimitri Fontaine
Markus Wanner writes: >> I'm just saying that this should be an option, not the only choice. > > I'm sorry, I just don't see the use case for a mode that drops > guarantees when they are most needed. People who don't need those > guarantees should definitely go for async replication instead. We'r

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Markus Wanner
On 10/06/2010 10:01 PM, Simon Riggs wrote: > The code to implement your desired option is > more complex and really should come later. I'm sorry, but I think of that exactly the opposite way. The timeout for automatic continuation after waiting for a standby is the addition. The wait state of the

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Simon Riggs
On Wed, 2010-10-06 at 10:57 -0700, Josh Berkus wrote: > (2), (3) Degradation: (Jeff) these two cases make sense only if we > give > DBAs the tools they need to monitor which standbys are falling behind, > and to drop and replace those standbys. Otherwise we risk giving DBAs > false confidence that

Re: [HACKERS] Issues with Quorum Commit

2010-10-07 Thread Simon Riggs
On Wed, 2010-10-06 at 10:57 -0700, Josh Berkus wrote: > I also strongly believe that we should get single-standby > functionality committed and tested *first*, before working further on > multi-standby. Yes, let's get k = 1 first. With k = 1 the number of standbys is not limited, so we can still
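
For readers following the "k = 1" discussion, here is a minimal Python sketch of the "first-acknowledgement-releases-waiters" idea, generalized to k acknowledgements. It is only an illustration under stated assumptions: the class `QuorumWaiter`, its methods, and the standby names are hypothetical and do not come from any patch discussed in this thread, and the optional timeout stands in for the "continue after waiting" behaviour debated elsewhere in the thread.

```python
import threading


class QuorumWaiter:
    """Hypothetical helper: a commit waits until k distinct standbys acknowledge."""

    def __init__(self, k=1):
        self.k = k                      # acknowledgements required (k = 1: first ack releases)
        self._acked = set()             # names of standbys that have acknowledged
        self._cond = threading.Condition()

    def acknowledge(self, standby_name):
        """Record an acknowledgement; repeated acks from one standby count once."""
        with self._cond:
            self._acked.add(standby_name)
            if len(self._acked) >= self.k:
                self._cond.notify_all()  # release every waiter at once

    def wait_for_quorum(self, timeout=None):
        """Block until k acks arrive. timeout=None waits forever.

        Returns True if the quorum was reached, False if the timeout expired.
        """
        with self._cond:
            return self._cond.wait_for(
                lambda: len(self._acked) >= self.k, timeout=timeout
            )


if __name__ == "__main__":
    waiter = QuorumWaiter(k=1)
    # Any number of standbys may exist; the first one to answer releases the waiter.
    threading.Timer(0.05, waiter.acknowledge, args=("standby_a",)).start()
    print("quorum reached:", waiter.wait_for_quorum(timeout=1.0))
```

With `k=1` the number of standbys is not limited, which matches the point made in the message above: whichever standby answers first releases the waiting commit. Whether a timeout should exist at all, and what the master should do when it expires, is exactly what the surrounding messages disagree about, so the sketch leaves that policy to the caller.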

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Simon Riggs
On Wed, 2010-10-06 at 18:04 +0300, Heikki Linnakangas wrote: > The key is whether you are guaranteed to have zero data loss or not. We agree that is an important question. You seem willing to trade anything for that guarantee. I seek a more pragmatic approach that balances availability and risk.

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Josh Berkus
> Seems reasonable, but what is a CAP database? Database based around the CAP theorem[1]. Cassandra, Dynamo, Hypertable, etc. For us, the equation is: CAD, as in Consistency, Availability, Durability. Pick any two, at best. But it's a very similar bag of issues as the ones CAP addresses. [1]

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Markus Wanner
On 10/06/2010 09:04 PM, Dimitri Fontaine wrote: > Ok so I think we're agreeing here: what I said amounts to propose that > the code does work this way when the quorum is such setup, and/or is > able to reject any non-read-only transaction (those that needs a real > XID) until your standby is fully

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Heikki Linnakangas
On 06.10.2010 20:57, Josh Berkus wrote: While it's nice to dismiss case (1) as an edge-case, consider the likelihood of someone running PostgreSQL with fsync=off on cloud hosting. In that case, having k = N = 5 does not seem like an unreasonable arrangement if you want to ensure durability via r

Re: [HACKERS] Issues with Quorum Commit

2010-10-06 Thread Dimitri Fontaine
Markus Wanner writes: > There's no point in time I > ever mind if a standby is a "candidate" or not. Either I want to > synchronously replicate to X standbys, or not. Ok so I think we're agreeing here: what I said amounts to propose that the code does work this way when the quorum is such setup,
