Tom Lane wrote:
Greg Smith g...@2ndquadrant.com writes:
I don't see this as needing any implementation any more complicated than
the usual way such timeouts are handled. Note how long you've been
trying to reach the standby. Default to -1 for forever. And if you hit
the timeout,
On Tue, Oct 12, 2010 at 11:50 PM, Robert Haas robertmh...@gmail.com wrote:
There's another problem here we should think about, too. Suppose you
have a master and two standbys. The master dies. You promote one of
the standbys, which turns out to be behind the other. You then
repoint the
On Wed, Oct 13, 2010 at 5:22 AM, Fujii Masao masao.fu...@gmail.com wrote:
On Wed, Oct 13, 2010 at 3:50 PM, Robert Haas robertmh...@gmail.com wrote:
There's another problem here we should think about, too. Suppose you
have a master and two standbys. The master dies. You promote one of
the
On Thu, Oct 14, 2010 at 11:18 AM, Greg Stark gsst...@mit.edu wrote:
Why don't the usual protections kick in here? The new record read from
the location the xlog reader is expecting to find it has to have a
valid CRC and a correct back pointer to the previous record.
Yep. In most cases, those
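The protections Greg refers to can be sketched roughly as follows. This is illustrative Python only: the record field names ('payload', 'crc', 'prev') and the CRC flavour are assumptions for the sketch, not the real XLogRecord layout.

```python
import zlib

def record_is_valid(record, expected_prev_lsn):
    """Accept a WAL record only if its checksum matches and its back
    pointer names the record we just replayed. Field names here are
    illustrative assumptions, not PostgreSQL's actual on-disk format."""
    if zlib.crc32(record['payload']) != record['crc']:
        return False  # torn write or garbage past the end of valid WAL
    if record['prev'] != expected_prev_lsn:
        return False  # record belongs to a different WAL history
    return True
```

A reader that stops at the first record failing either check will refuse to replay WAL from a diverged history, which is the usual protection being discussed here.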
On Wed, Oct 13, 2010 at 10:18 PM, Greg Stark gsst...@mit.edu wrote:
On Tue, Oct 12, 2010 at 11:50 PM, Robert Haas robertmh...@gmail.com wrote:
There's another problem here we should think about, too. Suppose you
have a master and two standbys. The master dies. You promote one of
the
On 13.10.2010 08:21, Fujii Masao wrote:
On Sat, Oct 9, 2010 at 4:31 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
It shouldn't be too hard to fix. Walsender needs to be able to read WAL from
preceding timelines, like recovery does, and walreceiver needs to write the
On Wed, Oct 13, 2010 at 2:43 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
On 13.10.2010 08:21, Fujii Masao wrote:
On Sat, Oct 9, 2010 at 4:31 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
It shouldn't be too hard to fix. Walsender needs to be able to
On 10/13/2010 06:43 AM, Fujii Masao wrote:
Unfortunately even enough standbys don't increase write-availability
unless you choose wait-forever. Because, after promoting one of
standbys to new master, you must keep all the transactions waiting
until at least one standby has connected to and
On Wed, Oct 13, 2010 at 3:43 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
On 13.10.2010 08:21, Fujii Masao wrote:
On Sat, Oct 9, 2010 at 4:31 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
It shouldn't be too hard to fix. Walsender needs to be able to
On Wed, Oct 13, 2010 at 3:50 PM, Robert Haas robertmh...@gmail.com wrote:
There's another problem here we should think about, too. Suppose you
have a master and two standbys. The master dies. You promote one of
the standbys, which turns out to be behind the other. You then
repoint the
On Sat, Oct 9, 2010 at 12:12 AM, Markus Wanner mar...@bluegap.ch wrote:
On 10/08/2010 04:48 PM, Fujii Masao wrote:
I believe many systems require write-availability.
Sure. Make sure you have enough standbys to fail over to.
Unfortunately even enough standbys don't increase write-availability
On Sat, Oct 9, 2010 at 1:41 AM, Josh Berkus j...@agliodbs.com wrote:
And, I'd like to know whether the master waits forever because of the
standby failure in other solutions such as Oracle DataGuard, MySQL
semi-synchronous replication.
MySQL used to be fond of simply failing silently. Not
On Sat, Oct 9, 2010 at 4:31 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
Yes. But if there is no unsent WAL when the master goes down,
we can start new standby without new backup by copying the
timeline history file from new master to new standby and
setting
Greg,
to me it looks like we have very similar goals, but start from different
preconditions. I absolutely agree with you given the preconditions you
named.
On 10/08/2010 10:04 PM, Greg Smith wrote:
How is that a new problem? It's already possible to end up with a
standby pair that has
Greg Smith g...@2ndquadrant.com writes:
[…]
I don't see this as needing any implementation any more complicated than the
usual way such timeouts are handled. Note how long you've been trying to
reach the standby. Default to -1 for forever. And if you hit the timeout,
mark the standby as
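Greg's proposal is the standard timeout pattern; a minimal sketch (hypothetical function and parameter names, not PostgreSQL code):

```python
import time

def wait_for_standby_ack(ack_received, timeout_s=-1, poll_interval=0.01):
    """Wait for a standby acknowledgement.

    timeout_s == -1 means wait forever (the proposed default); any
    non-negative value marks the standby as degraded once exceeded.
    Returns 'acked' or 'degraded'. Names are assumptions for the sketch.
    """
    start = time.monotonic()
    while not ack_received():
        if timeout_s >= 0 and time.monotonic() - start >= timeout_s:
            return 'degraded'  # stop blocking commits on this standby
        time.sleep(poll_interval)
    return 'acked'
```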
On 10/08/2010 12:30 AM, Simon Riggs wrote:
I do, but it's not a parameter. The k = 1 behaviour is hardcoded and
considerably simplifies the design. Moving to k > 1 is additional work,
slows things down and seems likely to be fragile.
Perfect! So I'm all in favor of committing that, but leaving
Simon,
On 10/08/2010 12:25 AM, Simon Riggs wrote:
Asking for k > 1 does *not* mean those servers are time synchronised.
Yes, it's technically impossible to create a fully synchronized cluster
(on the basis of shared-nothing nodes we are aiming for, that is). There
always is some kind of lag on
On 07.10.2010 21:38, Markus Wanner wrote:
On 10/07/2010 03:19 PM, Dimitri Fontaine wrote:
I think you're all into durability, and that's good. The extra cost is
service downtime
It's just *reduced* availability. That doesn't necessarily mean
downtime, if you combine cleverly with async
On 10/08/2010 04:01 AM, Fujii Masao wrote:
Really? I don't think that ko-count=0 means wait-forever.
Telling from the documentation, I'd also say it doesn't wait forever by
default. However, please note that there are different parameters for
the initial wait for connection during boot up
On 08.10.2010 06:41, Fujii Masao wrote:
On Thu, Oct 7, 2010 at 3:01 AM, Markus Wannermar...@bluegap.ch wrote:
Of course, it doesn't make sense to wait-forever on *every* standby that
ever gets added. Quorum commit is required, yes (and that's what this
thread is about, IIRC). But with quorum
On Fri, 2010-10-08 at 09:52 +0200, Markus Wanner wrote:
One addendum: a timeout increases availability at the cost of
increased danger of data loss and higher complexity. Don't use it,
just increase (N - k) instead.
Completely agree.
--
Simon Riggs www.2ndQuadrant.com
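The arithmetic behind "just increase (N - k)" is simple: with a quorum of k acks out of N standbys, commits keep flowing while any k standbys are alive, so up to N - k standbys can fail without blocking the master. A toy illustration (hypothetical helper name):

```python
def tolerable_standby_failures(n_standbys, quorum_k):
    """With a k-of-N quorum, the master keeps committing as long as
    at least k standbys respond, so N - k standby failures are
    survivable without any timeout."""
    assert 1 <= quorum_k <= n_standbys
    return n_standbys - quorum_k

# Example: a quorum of k = 3 out of N = 10 standbys survives 7 failures.
```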
On 10/08/2010 05:41 AM, Fujii Masao wrote:
But, even with quorum commit, if you choose wait-forever option,
failover would decrease availability. Right after the failover,
no standby has connected to new master, so if quorum = 1, all
the transactions must wait for a while.
That's a point,
On 08.10.2010 01:25, Simon Riggs wrote:
On Thu, 2010-10-07 at 13:44 -0400, Aidan Van Dyk wrote:
To get non-stale responses, you can only query those k=3 servers.
But you've shot yourself in the foot because you don't know which
3/10 those will be. The other 7 *are* stale (by definition).
On Fri, 2010-10-08 at 10:56 +0300, Heikki Linnakangas wrote:
Or what kind of customers do you think really need a no-lag solution for
read-only queries? In the LAN case, the lag of async rep is negligible
and in the WAN case the latencies of sync rep are prohibitive.
There is a very
On 08.10.2010 11:25, Simon Riggs wrote:
On Fri, 2010-10-08 at 10:56 +0300, Heikki Linnakangas wrote:
Or what kind of customers do you think really need a no-lag solution for
read-only queries? In the LAN case, the lag of async rep is negligible
and in the WAN case the latencies of sync rep are
On 10/08/2010 10:27 AM, Heikki Linnakangas wrote:
Synchronous replication in the 'replay' mode is supposed to guarantee
exactly that, no?
The master may lag behind, so it's not strictly speaking the same data.
Regards
Markus Wanner
On 10/08/2010 09:56 AM, Heikki Linnakangas wrote:
Imagine a web application that's mostly read-only, but a
user can modify his own personal details like name and address, for
example. Imagine that the user changes his street address and clicks
'save', causing an UPDATE, and the next query
On Fri, 2010-10-08 at 11:27 +0300, Heikki Linnakangas wrote:
On 08.10.2010 11:25, Simon Riggs wrote:
On Fri, 2010-10-08 at 10:56 +0300, Heikki Linnakangas wrote:
Or what kind of customers do you think really need a no-lag solution for
read-only queries? In the LAN case, the lag of async
On 10/08/2010 01:44 AM, Greg Smith wrote:
They'll use Sync Rep to maximize
the odds a system failure doesn't cause any transaction loss. They'll
use good quality hardware on the master so it's unlikely to fail.
..unlikely to fail?
Ehm.. is that you speaking, Greg? ;-)
But
when the
On 10/08/2010 11:00 AM, Simon Riggs wrote:
From the perspective of an observer, randomly selecting a standby for
load balancing purposes: No, they are not guaranteed to see the latest
answer, nor even can they find out whether what they are seeing is the
latest answer.
I completely agree. The
Markus Wanner mar...@bluegap.ch writes:
..and how do you make sure you are not marking your second standby as
degraded just because it's currently lagging?
Well, in sync rep, a standby that's not able to stay under the timeout
is degraded. Full stop. The presence of the timeout (or its value
On 10/08/2010 11:41 AM, Dimitri Fontaine wrote:
Same old story. Either you're able to try and fix the master so that you
don't lose any data and don't even have to check for that, or you take a
risk and start from a non synced standby. It's all availability against
durability again.
..and a
Markus Wanner mar...@bluegap.ch writes:
..and a whole lot of manual work, that's prone to error for something
that could easily be automated
So, the master just crashed, first standby is dead and second ain't in
sync. What's the easy and automated way out? Sorry, I need a hand here.
On Fri, Oct 8, 2010 at 5:07 PM, Markus Wanner mar...@bluegap.ch wrote:
On 10/08/2010 04:01 AM, Fujii Masao wrote:
Really? I don't think that ko-count=0 means wait-forever.
Telling from the documentation, I'd also say it doesn't wait forever by
default. However, please note that there are
Greg Smith g...@2ndquadrant.com writes:
I don't see this as needing any implementation any more complicated than
the usual way such timeouts are handled. Note how long you've been
trying to reach the standby. Default to -1 for forever. And if you hit
the timeout, mark the standby as
On Fri, Oct 8, 2010 at 5:10 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
Do we really need that?
Yes. But if there is no unsent WAL when the master goes down,
we can start new standby without new backup by copying the
timeline history file from new master to new standby and
On 10/08/2010 04:11 PM, Tom Lane wrote:
Actually, #2 seems rather difficult even if you want it. Presumably
you'd like to keep that state in reliable storage, so it survives master
crashes. But how you gonna commit a change to that state, if you just
lost every standby (suppose master's
Tom Lane t...@sss.pgh.pa.us writes:
Well, actually, that's *considerably* more complicated than just a
timeout. How are you going to mark the standby as degraded? The
standby can't keep that information, because it's not even connected
when the master makes the decision. ISTM that this
On 10/08/2010 12:05 PM, Dimitri Fontaine wrote:
Markus Wanner mar...@bluegap.ch writes:
..and a whole lot of manual work, that's prone to error for something
that could easily be automated
So, the master just crashed, first standby is dead and second ain't in
sync. What's the easy and
Markus Wanner mar...@bluegap.ch writes:
On 10/08/2010 04:11 PM, Tom Lane wrote:
Actually, #2 seems rather difficult even if you want it. Presumably
you'd like to keep that state in reliable storage, so it survives master
crashes. But how you gonna commit a change to that state, if you just
On 10/08/2010 04:38 PM, Tom Lane wrote:
Markus Wanner mar...@bluegap.ch writes:
IIUC you seem to assume that the master node keeps its master role. But
users who value availability a lot certainly want automatic fail-over,
Huh? Surely loss of the slaves shouldn't force a failover. Maybe
On Fri, 2010-10-08 at 10:11 -0400, Tom Lane wrote:
1. a unique identifier for each standby (not just role names that
multiple standbys might share);
That is difficult because each standby is identical. If a standby goes
down, people can regenerate a new standby by taking a copy from another
On Fri, Oct 8, 2010 at 5:16 PM, Markus Wanner mar...@bluegap.ch wrote:
On 10/08/2010 05:41 AM, Fujii Masao wrote:
But, even with quorum commit, if you choose wait-forever option,
failover would decrease availability. Right after the failover,
no standby has connected to new master, so if
On Fri, Oct 8, 2010 at 6:00 PM, Simon Riggs si...@2ndquadrant.com wrote:
From the perspective of an observer, randomly selecting a standby for
load balancing purposes: No, they are not guaranteed to see the latest
answer, nor even can they find out whether what they are seeing is the
latest
On Fri, 2010-10-08 at 23:55 +0900, Fujii Masao wrote:
On Fri, Oct 8, 2010 at 6:00 PM, Simon Riggs si...@2ndquadrant.com wrote:
From the perspective of an observer, randomly selecting a standby for
load balancing purposes: No, they are not guaranteed to see the latest
answer, nor even can
On 10/08/2010 04:47 PM, Simon Riggs wrote:
Yes, I really want to avoid such issues and likely complexities we get
into trying to solve them. In reality they should not be common because
it only happens if the sysadmin has not configured sufficient number of
redundant standbys.
Well, full
On 10/08/2010 04:48 PM, Fujii Masao wrote:
I believe many systems require write-availability.
Sure. Make sure you have enough standbys to fail over to.
(I think there are even more situations where read-availability is much
more important, though).
Start with 0 (i.e. replication off), then
And, I'd like to know whether the master waits forever because of the
standby failure in other solutions such as Oracle DataGuard, MySQL
semi-synchronous replication.
MySQL used to be fond of simply failing silently. Not sure what 5.4
does, or Oracle. In any case MySQL's replication has
On 10/8/10, Fujii Masao masao.fu...@gmail.com wrote:
On Fri, Oct 8, 2010 at 5:10 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
Do we really need that?
Yes. But if there is no unsent WAL when the master goes down,
we can start new standby without new backup by copying
On 08.10.2010 17:26, Fujii Masao wrote:
On Fri, Oct 8, 2010 at 5:10 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
Do we really need that?
Yes. But if there is no unsent WAL when the master goes down,
we can start new standby without new backup by copying the
timeline
Markus Wanner wrote:
..and how do you make sure you are not marking your second standby as
degraded just because it's currently lagging? Effectively degrading the
utterly needed one, because your first standby has just bitten the dust?
People are going to monitor the standby lag. If it
Tom Lane wrote:
How are you going to mark the standby as degraded? The
standby can't keep that information, because it's not even connected
when the master makes the decision.
From a high level, I'm assuming only that the master has a list in
memory of the standby system(s) it believes are
On Fri, 2010-10-08 at 17:06 +0200, Markus Wanner wrote:
Well, full cluster outages are infrequent, but sadly cannot be avoided
entirely. (Murphy's laughing). IMO we should be prepared to deal with
those.
I've described how I propose to deal with those. I'm not waving away
these issues, just
On Fri, 2010-10-08 at 16:34 -0400, Greg Smith wrote:
Tom Lane wrote:
How are you going to mark the standby as degraded? The
standby can't keep that information, because it's not even connected
when the master makes the decision.
From a high level, I'm assuming only that the master has
On Wed, 2010-10-06 at 10:57 -0700, Josh Berkus wrote:
I also strongly believe that we should get single-standby
functionality committed and tested *first*, before working further on
multi-standby.
Yes, let's get k = 1 first.
With k = 1 the number of standbys is not limited, so we can still
On Wed, 2010-10-06 at 10:57 -0700, Josh Berkus wrote:
(2), (3) Degradation: (Jeff) these two cases make sense only if we
give
DBAs the tools they need to monitor which standbys are falling behind,
and to drop and replace those standbys. Otherwise we risk giving DBAs
false confidence that
On 10/06/2010 10:01 PM, Simon Riggs wrote:
The code to implement your desired option is
more complex and really should come later.
I'm sorry, but I think of that exactly the opposite way. The timeout for
automatic continuation after waiting for a standby is the addition. The
wait state of the
Markus Wanner mar...@bluegap.ch writes:
I'm just saying that this should be an option, not the only choice.
I'm sorry, I just don't see the use case for a mode that drops
guarantees when they are most needed. People who don't need those
guarantees should definitely go for async replication
On 07.10.2010 12:52, Dimitri Fontaine wrote:
Markus Wannermar...@bluegap.ch writes:
I'm just saying that this should be an option, not the only choice.
I'm sorry, I just don't see the use case for a mode that drops
guarantees when they are most needed. People who don't need those
guarantees
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
Either that, or you configure your system for asynchronous replication
first, and flip the switch to synchronous only after the standby has caught
up. Setting up the first standby happens only once when you initially set up
the
On Thu, 2010-10-07 at 11:46 +0200, Markus Wanner wrote:
On 10/06/2010 10:01 PM, Simon Riggs wrote:
The code to implement your desired option is
more complex and really should come later.
I'm sorry, but I think of that exactly the opposite way.
I see why you say that. Dimitri's suggestion
On 10/07/2010 01:08 PM, Simon Riggs wrote:
Adding timeout is very little code. We can take that out of the patch if
that's an objection.
Okay. If you take it out, we are at the wait-forever option, right?
If not, I definitely don't understand how you envision things to happen.
I've been asking
On Thu, Oct 7, 2010 at 3:30 AM, Simon Riggs si...@2ndquadrant.com wrote:
Yes, let's get k = 1 first.
With k = 1 the number of standbys is not limited, so we can still have
very robust and highly available architectures. So we mean
first-acknowledgement-releases-waiters.
+1. I like the design
Salut Dimitri,
On 10/07/2010 12:32 PM, Dimitri Fontaine wrote:
Another one is to say that I want sync rep when the standby is
available, but I don't have the budget for more. So I prefer a good
alerting system and low-budget-no-guarantee when the standby is down,
that's my risk evaluation.
I
Markus Wanner mar...@bluegap.ch writes:
Why does one ever want the guarantee that sync replication gives to only
hold true up to one failure, if a better guarantee doesn't cost anything
extra? (Note that a good alerting system is impossible to achieve with
only two servers. You need a third
On Thu, Oct 7, 2010 at 6:32 AM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
Or if the standby is lagging and the master wal_keep_segments is not
sized big enough. Is that a catastrophic loss of the standby too?
Sure, but that lagged standby is already asynchronous, not
synchronous. If it
Aidan Van Dyk ai...@highrise.ca writes:
Sure, but that lagged standby is already asynchronous, not
synchronous. If it was synchronous, it would have slowed the master
down enough it would not be lagged.
Agreed, except in the case of a joining standby. But you're saying it
better than I do:
On Thu, Oct 7, 2010 at 10:08 AM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
Aidan Van Dyk ai...@highrise.ca writes:
Sure, but that lagged standby is already asynchronous, not
synchronous. If it was synchronous, it would have slowed the master
down enough it would not be lagged.
Agreed,
Aidan Van Dyk ai...@highrise.ca writes:
*shrug* The joining standby is still asynchronous at this point.
It's not synchronous replication. It's just another of the N-k
slaves serving stale data ;-)
Agreed *here*, but if you read the threads again, you'll see that's not
at all what's been
Markus Wanner wrote:
I think that's a pretty special case, because the good alerting system
is at least as expensive as another server that just persistently stores
and ACKs incoming WAL.
The cost of hardware capable of running a database server is a large
multiple of what you can build an
On 10/7/10 6:41 AM, Aidan Van Dyk wrote:
I'm really confused with all this k < N scenarios I see bandied
about, because, all it really amounts to is I only want *one*
synchronous replication, and a bunch of asynchronous replications.
And a bit of chance thrown in the mix to hope the synchronous
On Thu, Oct 7, 2010 at 1:22 PM, Josh Berkus j...@agliodbs.com wrote:
So if you have k = 3 and N = 10, then you can have 10 standbys and only
3 of them need to ack any specific commit for the master to proceed. As
long as (a) you retain at least one of the 3 which ack'd, and (b) you
have some
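Josh's k-of-N rule can be sketched as a per-commit ack tracker (illustrative Python; class and method names are assumptions):

```python
class QuorumCommit:
    """Track standby acknowledgements for one commit; the first k acks
    release the waiting backend (k-of-N quorum). Names are assumed."""

    def __init__(self, k):
        self.k = k
        self.acked = set()  # standbys that confirmed this commit's LSN

    def ack(self, standby_name):
        """Record one standby's ack; return True once quorum is reached."""
        self.acked.add(standby_name)
        return len(self.acked) >= self.k
```

Condition (a) in the message above is the crucial follow-through: for the guarantee to hold after a failover, the standby you promote must be one of the members of `acked`.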
If you want synchronous replication because you want query
availability while making sure you're not getting stale queries from
all your slaves, then using your k < N (k = 3 and N = 10) situation is
screwing yourself.
Correct. If that is your reason for synch standby, then you should be
using
On 10/07/2010 06:41 PM, Greg Smith wrote:
The cost of hardware capable of running a database server is a large
multiple of what you can build an alerting machine for.
You realize you don't need lots of disks nor RAM for a box that only
ACKs? A box with two SAS disks and a BBU isn't that
But as a practical matter, I'm afraid the true cost of the better
guarantee you're suggesting here is additional code complexity that will
likely cause this feature to miss 9.1 altogether. As far as I'm
concerned, this whole diversion into the topic of quorum commit is only
consuming
Aidan Van Dyk ai...@highrise.ca wrote:
To get non-stale responses, you can only query those k=3
servers. But you've shot yourself in the foot because you don't
know which 3/10 those will be. The other 7 *are* stale (by
definition). They talk about picking the caught up slave when
the
On Thu, Oct 7, 2010 at 2:10 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
Aidan Van Dyk ai...@highrise.ca wrote:
To get non-stale responses, you can only query those k=3
servers. But you've shot yourself in the foot because you don't
know which 3/10 those will be. The other 7 *are*
Robert Haas robertmh...@gmail.com wrote:
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
With web applications, at least, you often don't care that the
data read is absolutely up-to-date, as long as the point in time
doesn't jump around from one request to the next. When we have
used
On 10/07/2010 03:19 PM, Dimitri Fontaine wrote:
I think you're all into durability, and that's good. The extra cost is
service downtime
It's just *reduced* availability. That doesn't necessarily mean
downtime, if you combine cleverly with async replication.
if that's not what you're after:
On 10/07/2010 07:44 PM, Aidan Van Dyk wrote:
The only case I see a race to quorum type of k < N being useful is
if you're just trying to duplicate data everywhere, but not actually
querying any of the replicas. I can see that all queries go to the
master, but the chances are pretty high the
On Thu, Oct 7, 2010 at 2:31 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
Robert Haas robertmh...@gmail.com wrote:
Kevin Grittner kevin.gritt...@wicourts.gov wrote:
With web applications, at least, you often don't care that the
data read is absolutely up-to-date, as long as the point
Markus Wanner mar...@bluegap.ch writes:
I don't buy that. The risk calculation gets a lot simpler and obvious
with strict guarantees.
Ok, I'm lost in the use cases and analysis.
I still don't understand why you want to consider the system already
synchronous when it's not, whatever is the
Robert Haas robertmh...@gmail.com wrote:
Establishing an affinity between a session and one of the database
servers will only help if the traffic is strictly read-only.
Thanks; I now see your point.
In our environment, that's pretty common. Our most heavily used web
app (the one for which
On Thu, 2010-10-07 at 13:44 -0400, Aidan Van Dyk wrote:
To get non-stale responses, you can only query those k=3 servers.
But you've shot yourself in the foot because you don't know which
3/10 those will be. The other 7 *are* stale (by definition). They
talk about picking the caught up
On Thu, 2010-10-07 at 19:50 +0200, Markus Wanner wrote:
So far I've been under the impression that Simon already has the code
for quorum_commit k = 1.
I do, but it's not a parameter. The k = 1 behaviour is hardcoded and
considerably simplifies the design. Moving to k > 1 is additional work,
All,
Establishing an affinity between a session and one of the database
servers will only help if the traffic is strictly read-only.
I think this thread has drifted very far away from anything we're going
to do for 9.1. And seems to have little to do with synchronous replication.
Synch rep
Markus Wanner wrote:
So far I've been under the impression that Simon already has the code
for quorum_commit k = 1.
What I'm opposing to is the timeout feature, which I consider to be
additional code, unneeded complexity and foot-gun.
Additional code? Yes. Foot-gun? Yes. Timeout should
On Wed, Oct 6, 2010 at 6:11 PM, Markus Wanner mar...@bluegap.ch wrote:
Yeah, sounds more likely. Then I'm surprised that I didn't find any
warning that the Protocol C definitely reduces availability (with the
ko-count=0 default, that is).
Really? I don't think that ko-count=0 means
On Wed, Oct 6, 2010 at 6:00 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
In general, salvaging the WAL that was not sent to the standby yet is
outright impossible. You can't achieve zero data loss with asynchronous
replication at all.
No. That depends on the type of
On Thu, Oct 7, 2010 at 10:24 PM, Fujii Masao masao.fu...@gmail.com wrote:
On Wed, Oct 6, 2010 at 6:00 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
In general, salvaging the WAL that was not sent to the standby yet is
outright impossible. You can't achieve zero data loss
On Wed, Oct 6, 2010 at 9:22 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
From my experience operating londiste, those states would be:
1. base-backup — self explaining
2. catch-up — getting the WAL to catch up after base backup
3. wanna-sync — don't yet have all the WAL to get
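The standby lifecycle Dimitri lists can be written as a simple state enum. Only the three states named in the message are included (the archive truncates the list); the descriptions are paraphrases of his:

```python
from enum import Enum

class StandbyState(Enum):
    """Standby catch-up states as listed in the message; the preview is
    truncated, so only these three states are shown."""
    BASE_BACKUP = 'base-backup'  # taking the initial base backup
    CATCH_UP = 'catch-up'        # replaying WAL accumulated during backup
    WANNA_SYNC = 'wanna-sync'    # nearly caught up; not yet synchronous
```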
On Thu, Oct 7, 2010 at 5:01 AM, Simon Riggs si...@2ndquadrant.com wrote:
You seem willing to trade anything for that guarantee. I seek a more
pragmatic approach that balances availability and risk.
Those views are different, but not inconsistent. Oracle manages to offer
multiple options and
On Thu, Oct 7, 2010 at 3:01 AM, Markus Wanner mar...@bluegap.ch wrote:
Of course, it doesn't make sense to wait-forever on *every* standby that
ever gets added. Quorum commit is required, yes (and that's what this
thread is about, IIRC). But with quorum commit, adding a standby only
improves
On Thu, 2010-10-07 at 19:44 -0400, Greg Smith wrote:
I don't see this as needing any implementation any more complicated than
the usual way such timeouts are handled. Note how long you've been
trying to reach the standby. Default to -1 for forever. And if you hit
the timeout, mark the
On Fri, Oct 8, 2010 at 8:44 AM, Greg Smith g...@2ndquadrant.com wrote:
Additional code? Yes. Foot-gun? Yes. Timeout should be disabled by
default so that you get wait forever unless you ask for something different?
Probably. Unneeded? This is where we don't agree anymore. The example
On 06.10.2010 01:14, Josh Berkus wrote:
Last I checked, our goal with synch standby was to increase availability,
not decrease it.
No. Synchronous replication does not help with availability. It allows
you to achieve zero data loss, ie. if the master dies, you are
guaranteed that any
On 06.10.2010 01:14, Josh Berkus wrote:
You start a new one from the latest base backup and let it catch up?
Possibly modifying the config file in the master to let it know about
the new standby, if we go down that path. This part doesn't seem
particularly hard to me.
Agreed, not sure of the
On 10/06/2010 04:31 AM, Simon Riggs wrote:
That situation would require two things
* First, you have set up async replication and you're not monitoring it
properly. Shame on you.
The way I read it, Jeff is complaining about the timeout you propose
that effectively turns sync into async
On Wed, Oct 6, 2010 at 10:52 AM, Jeff Davis pg...@j-davis.com wrote:
I'm not sure I entirely understand. I was concerned about the case of a
standby server being allowed to lag behind the rest by a large number of
WAL records. That can't happen in the wait for all servers to apply
case,
On 10/06/2010 08:31 AM, Heikki Linnakangas wrote:
On 06.10.2010 01:14, Josh Berkus wrote:
Last I checked, our goal with synch standby was to increase availability,
not decrease it.
No. Synchronous replication does not help with availability. It allows
you to achieve zero data loss, ie. if