Interesting discussion!

>>> Basically I like this whole idea, but I'd like to know why do you think 
>>> this functionality is required?

>> How should a synchronous master handle the situation where all
>> standbys have failed ?
>>
>> Well, I think this is one of those cases where you could argue either
>> way. Someone caring more about high availability of the system will
>> want to let the master continue and just raise an alert to the
>> operators. Someone looking for an absolute guarantee of data
>> replication will say otherwise.

>If you don't care about the absolute guarantee of data, why not just
>use async replication? It's still going to replicate the data over to
>the client as quickly as it can - which in the end is the same level
>of guarantee that you get with this switch set, isn't it?

This setup still guarantees that if the master fails while replication
is up, you can fail over to the standby without any data loss, because
everything committed up to that point has been synchronously replicated.

I want to replicate data with a synchronous guarantee to a disaster site
*when possible*. If there is any chance that a commit can be
replicated, then I’d like to wait for that.

If, however, the disaster node/site/link just plain fails and
replication goes down for an *indefinite* amount of time, then I want
the primary node to continue operating, raise an alert, and let the
operators deal with it, rather than have the whole system grind to a
halt just because a standby node failed.
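
For the record, the primary-side configuration I have in mind would look
roughly like this (synchronous_standalone_master is the GUC proposed by
this patch; the standby name and values are just illustrative):

    # postgresql.conf on the primary (illustrative values)
    synchronous_commit = on
    synchronous_standby_names = 'dr_standby'   # the disaster-site standby
    synchronous_standalone_master = on         # proposed here: keep accepting
                                               # commits if no synchronous
                                               # standby is available

So commits wait for the disaster-site standby whenever it is reachable,
and the primary degrades to standalone (and alerts) only when it is not.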

It’s not that I don’t “care” about the replication guarantee; if that
were the case I’d just use asynchronous replication and be done with it.
My point is that it is not always black and white, and for some system
setups you have to balance a few things against each other.

If we were only talking about network glitches, then I would be fine
with the current behavior, because I do not believe they are
long-lasting anyway, and they are also *quantifiable*, which is a huge
bonus.

My primary focus is system availability, but I care about all that
other stuff too.

I want to have the cake and eat it at the same time, as we say in Sweden ;)

>>> When is the replication mode switched from "standalone" to "sync"?
>>
>> Good question. Currently that happens when a standby server has
>> connected and also been deemed suitable for synchronous commit by the
>> master (meaning that its name matches the config variable
>> synchronous_standby_names). So in a setup with both synchronous and
>> asynchronous standbys, the master only considers the synchronous ones
>> when deciding on standalone mode. The asynchronous standbys are
>> “useless” to a synchronous master anyway.

>But wouldn't an async standby still be a lot better than no standby at
>all (standalone)?

As soon as the standby comes back online, I want to wait for it to sync.
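
(And for what it's worth, whether that has happened is easy to see from
the master; something along the lines of

    -- on the master: list connected standbys and their sync status
    SELECT application_name, state, sync_state FROM pg_stat_replication;

would show the standby going from catchup back to streaming/sync, so an
operator or the alerting system can tell when the standalone period is
over.)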

>>> The former might block the transactions for a long time until the standby 
>>> has caught up with the master even though synchronous_standalone_master is 
>>> enabled and a user wants to avoid such a downtime.
>>
>> If we are talking about a network “glitch”, then the standby would take
>> a few seconds/minutes to catch up (not hours!), which is acceptable if
>> you ask me.

>So it's not Ok to block the master when the standby goes away, but it
>is ok to block it when it comes back and catches up? The goes away
>might be the same amount of time - or even shorter, depending on
>exactly how the network works..

To be honest, I don’t have a very strong opinion here; we could go
either way. I just wanted to keep this patch as small as possible to
begin with. But again, network glitches aren’t my primary concern in an
HA system, because the amount of data the standby lags behind can be
estimated and planned for.

Switch convergence typically takes on the order of 15-30 seconds, so I
can assume that the reconnected standby will recover that gap in less
than a minute. Once in a blue moon, when something like that happens,
commits would take up to, say, one minute longer. No big deal IMHO.
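
Just to put some rough numbers on that (both rates below are assumptions
for the sake of illustration, not measurements):

    WAL generated on the primary:    ~5 MB/s   (assumed workload)
    glitch duration:                 ~30 s
    backlog to ship at reconnect:    ~150 MB
    transfer/apply rate on catch-up: ~50 MB/s  (assumed)
    catch-up time:                   a few seconds, plus reconnect overhead

The exact figures depend entirely on the workload and the link, but the
point is that the gap is bounded and easy to reason about, unlike an
indefinite standby outage.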

>>> 1. While synchronous replication is running normally, the replication
>>>    connection is closed because of a network outage.
>>> 2. The master works standalone because of synchronous_standalone_master=on,
>>>    and some new transactions are committed though their WAL records are
>>>    not replicated to the standby.
>>> 3. The master crashes for some reason, the clusterware detects it and
>>>    triggers a failover.
>>> 4. The standby, which doesn't have the recent committed transactions,
>>>    becomes the master at failover...

>>> Is this scenario acceptable?

>> So you have two separate failures in less time than an admin would
>> have time to react and manually bring up a new standby.

>Given that one is a network failure, and one is a node failure, I
>don't see that being strange at all. For example, a HA network
>environment might cause a short glitch when it's failing over to a
>redundant node - enough to bring down the replication connection and
>require it to reconnect (during which the master would be ahead of the
>slave).

>In fact, both might well be network failures - one just making the
>master completely inaccessible, and thus triggering the need for a
>failover.

You still have two failures on a two-node system.

If we are talking about a setup with only two nodes (which I am), then
I think it’s fair to limit the discussion to one failure (whatever that
might be: node, switch, disk, site, intra-site link, power, etc.).

And in that case, there are really only three likely scenarios:
1) The master fails
2) The standby fails
3) Both fail (due to shared network gear, power, etc.)

Yes, there might be a need to fail over, and yes, the standby could
have lagged behind the master, but with my sync+standalone mode you
reduce the risk of that compared to plain async mode.

So: decrease the risk of data loss (case 1), and increase system
availability/uptime (case 2).

That is actually a pretty good description of my goal here :)

Cheers,

/A
