Re: [HACKERS] After switching primary server while using replication slot.

2014-08-27 Thread Fujii Masao
On Fri, Aug 22, 2014 at 11:29 PM, Andres Freund and...@2ndquadrant.com wrote:
 Hi,

 On 2014-08-20 13:14:30 -0400, Robert Haas wrote:
 On Tue, Aug 19, 2014 at 6:25 AM, Fujii Masao masao.fu...@gmail.com wrote:
  On Mon, Aug 18, 2014 at 11:16 PM, Sawada Masahiko sawada.m...@gmail.com 
  wrote:
  Hi all,
  After switching the primary server while using replication slots, a
  standby server may not be able to connect to the new primary server.
  Imagine this situation: a primary server has two ASYNC standby
  servers, each using its own replication slot.
  Standby (A) applies WAL without problems, but standby (B) has stopped
  after connecting to the primary server (or its WAL streaming is
  heavily delayed).

  In this situation, standby (B) has not received some WAL segment files
  while it was stopped.
  The primary server cannot remove WAL segments that have not yet been
  received by every standby, so it has to keep those WAL segment files.
  But standby (A) performs its own checkpoints, so it can recycle WAL
  segments.
  As a result, the number of WAL segments differs between the servers
  (standby (A) keeps fewer WAL files than the primary server).
  If the primary server crashes and standby (A) is promoted to primary,
  we can try to connect standby (B) to standby (A) as a new standby.
  But this will fail, because standby (A) might not have the WAL segment
  files that standby (B) requires.
 
  This sounds like a valid concern.

  To resolve this situation, I think the master server should notify
  all standby servers about the removal of WAL segments, and the
  standby servers should recycle WAL segment files based on that
  information.

 I think that'll end up being really horrible, at least if done in an
 obligatory fashion. In a cascaded setup it's really sensible to only
 retain WAL on the intermediate nodes. Consider e.g. a setup - rather
 common these days actually - where there's a master somewhere and then a
 cascading standby on each continent feeding off to further nodes on that
 continent. You don't want to retain WAL on each continent (or on the
 primary) just because one node somewhere is down for maintenance.


 If you really want something like this we should probably add the
 infrastructure for one standby to maintain a replication slot on another
 standby server. So, if you have a setup like:

        A
       / \
      /   \
     B     C
    / \   / \
  ..  .. ..  ..

 B and C can coordinate so that they keep enough WAL for each other. You
 can actually easily write an external tool for that today: just create a
 replication slot on B for C and the other way round, and have a tool
 update them once a minute or so.

 I'm not sure we want that built in.

  Thoughts?
 
  How does the server recycle WAL files after it's promoted from
  standby to master?
  Does it do that however it likes? If so, your approach would not be
  enough.

  The approach prevents unexpected removal of WAL files while the standby
  is running. But after the standby is promoted to master, it might recycle
  needed WAL files immediately. So another standby may still fail to
  retrieve the required WAL file after the promotion.

  ISTM that, in order to address this, we might need to log all the
  replication slot activities and replicate them to the standby. I'm not
  sure if this breaks the design of replication slots at all, though.

 Yes, that'd break it. You can't WAL log anything on a standby, and
 replication slots can be modified on standbys.

So the current workaround for the problem Sawada reported is probably to
increase wal_keep_segments on the standby to a sufficiently high value.
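How high is essentially a segment-counting exercise: the gap between the primary's current WAL position and the lagging standby's replay position, in 16 MB segments. A rough sketch of that arithmetic, assuming the default segment size and the 'X/X' LSN strings reported by functions such as pg_current_xlog_location():

```python
# Rough sketch: how many WAL segments separate two LSNs, i.e. a lower
# bound for wal_keep_segments if the server must bridge that gap.
# Assumes the default 16 MB segment size (--with-wal-segsize).

WAL_SEGMENT_SIZE = 16 * 1024 * 1024

def parse_lsn(lsn):
    """Convert an 'X/Y' LSN string (e.g. '16/B374D848') to a byte position."""
    hi, lo = lsn.split('/')
    return (int(hi, 16) << 32) + int(lo, 16)

def segment_gap(primary_lsn, standby_lsn):
    """WAL segments between the standby's replay position and the
    primary's current position."""
    return (parse_lsn(primary_lsn) // WAL_SEGMENT_SIZE
            - parse_lsn(standby_lsn) // WAL_SEGMENT_SIZE)

# Example: a standby stalled five segments behind the primary.
print(segment_gap('0/6000000', '0/1000000'))  # 5
```

Note that wal_keep_segments only bounds retention on the server it is set on, so it would have to be raised on every standby that might later be promoted.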

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] After switching primary server while using replication slot.

2014-08-22 Thread Andres Freund
Hi,

On 2014-08-20 13:14:30 -0400, Robert Haas wrote:
 On Tue, Aug 19, 2014 at 6:25 AM, Fujii Masao masao.fu...@gmail.com wrote:
  On Mon, Aug 18, 2014 at 11:16 PM, Sawada Masahiko sawada.m...@gmail.com 
  wrote:
  Hi all,
  After switching the primary server while using replication slots, a
  standby server may not be able to connect to the new primary server.
  Imagine this situation: a primary server has two ASYNC standby
  servers, each using its own replication slot.
  Standby (A) applies WAL without problems, but standby (B) has stopped
  after connecting to the primary server (or its WAL streaming is
  heavily delayed).

  In this situation, standby (B) has not received some WAL segment files
  while it was stopped.
  The primary server cannot remove WAL segments that have not yet been
  received by every standby, so it has to keep those WAL segment files.
  But standby (A) performs its own checkpoints, so it can recycle WAL
  segments.
  As a result, the number of WAL segments differs between the servers
  (standby (A) keeps fewer WAL files than the primary server).
  If the primary server crashes and standby (A) is promoted to primary,
  we can try to connect standby (B) to standby (A) as a new standby.
  But this will fail, because standby (A) might not have the WAL segment
  files that standby (B) requires.
 
  This sounds like a valid concern.

  To resolve this situation, I think the master server should notify
  all standby servers about the removal of WAL segments, and the
  standby servers should recycle WAL segment files based on that
  information.

I think that'll end up being really horrible, at least if done in an
obligatory fashion. In a cascaded setup it's really sensible to only
retain WAL on the intermediate nodes. Consider e.g. a setup - rather
common these days actually - where there's a master somewhere and then a
cascading standby on each continent feeding off to further nodes on that
continent. You don't want to retain WAL on each continent (or on the
primary) just because one node somewhere is down for maintenance.


If you really want something like this we should probably add the
infrastructure for one standby to maintain a replication slot on another
standby server. So, if you have a setup like:

       A
      / \
     /   \
    B     C
   / \   / \
 ..  .. ..  ..

B and C can coordinate so that they keep enough WAL for each other. You
can actually easily write an external tool for that today: just create a
replication slot on B for C and the other way round, and have a tool
update them once a minute or so.

I'm not sure we want that built in.
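For what it's worth, one iteration of such a tool could look roughly like the sketch below. The host and slot names are made up, and pg_replication_slot_advance() only exists in later PostgreSQL releases, so this illustrates the idea rather than something runnable against 9.4:

```python
# Sketch of the external slot-keeper tool suggested above: standby B holds
# a physical replication slot for standby C (and vice versa), and a
# periodic job advances the local slot to the peer's replay position, so
# each standby keeps the WAL the other may still need.
# PEER_HOST and LOCAL_SLOT are illustrative names, not from the thread.
import subprocess

PEER_HOST = "standby-c.example.com"   # hypothetical peer standby
LOCAL_SLOT = "keep_for_c"             # slot held here on behalf of the peer

def advance_sql(slot, lsn):
    """SQL that moves a local replication slot forward to the given LSN."""
    return "SELECT pg_replication_slot_advance('%s', '%s');" % (slot, lsn)

def run_once():
    """One iteration: read the peer's replay LSN and advance our slot."""
    peer_lsn = subprocess.check_output(
        ["psql", "-h", PEER_HOST, "-At",
         "-c", "SELECT pg_last_wal_replay_lsn();"]).decode().strip()
    subprocess.check_call(
        ["psql", "-At", "-c", advance_sql(LOCAL_SLOT, peer_lsn)])

# A cron job would call run_once() once a minute per peer; here we only
# show the statement it would issue for a sample replay position.
print(advance_sql(LOCAL_SLOT, "0/3000000"))
```

On 9.4 itself, where no SQL function advances a physical slot, the equivalent tool would presumably have to speak the streaming replication protocol and report the peer's position in its standby feedback messages.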

  Thoughts?
 
  How does the server recycle WAL files after it's promoted from
  standby to master?
  Does it do that however it likes? If so, your approach would not be
  enough.

  The approach prevents unexpected removal of WAL files while the standby
  is running. But after the standby is promoted to master, it might recycle
  needed WAL files immediately. So another standby may still fail to
  retrieve the required WAL file after the promotion.

  ISTM that, in order to address this, we might need to log all the
  replication slot activities and replicate them to the standby. I'm not
  sure if this breaks the design of replication slots at all, though.

Yes, that'd break it. You can't WAL log anything on a standby, and
replication slots can be modified on standbys.

 I believe that the reason why replication slots are not currently
 replicated is because we had the idea that the standby could have
 slots that don't exist on the master, for cascading replication.  I'm
 not sure that works yet, but I think Andres definitely had it in mind
 in the original design.

That works. And it's absolutely required for adding logical decoding on
standbys (I've a prototype patch for it...).

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] After switching primary server while using replication slot.

2014-08-20 Thread Robert Haas
On Tue, Aug 19, 2014 at 6:25 AM, Fujii Masao masao.fu...@gmail.com wrote:
 On Mon, Aug 18, 2014 at 11:16 PM, Sawada Masahiko sawada.m...@gmail.com 
 wrote:
 Hi all,
 After switching the primary server while using replication slots, a
 standby server may not be able to connect to the new primary server.
 Imagine this situation: a primary server has two ASYNC standby
 servers, each using its own replication slot.
 Standby (A) applies WAL without problems, but standby (B) has stopped
 after connecting to the primary server (or its WAL streaming is
 heavily delayed).

 In this situation, standby (B) has not received some WAL segment files
 while it was stopped.
 The primary server cannot remove WAL segments that have not yet been
 received by every standby, so it has to keep those WAL segment files.
 But standby (A) performs its own checkpoints, so it can recycle WAL
 segments.
 As a result, the number of WAL segments differs between the servers
 (standby (A) keeps fewer WAL files than the primary server).
 If the primary server crashes and standby (A) is promoted to primary,
 we can try to connect standby (B) to standby (A) as a new standby.
 But this will fail, because standby (A) might not have the WAL segment
 files that standby (B) requires.

 This sounds like a valid concern.

 To resolve this situation, I think the master server should notify all
 standby servers about the removal of WAL segments, and the standby
 servers should recycle WAL segment files based on that information.

 Thoughts?

 How does the server recycle WAL files after it's promoted from
 standby to master?
 Does it do that however it likes? If so, your approach would not be enough.

 The approach prevents unexpected removal of WAL files while the standby
 is running. But after the standby is promoted to master, it might recycle
 needed WAL files immediately. So another standby may still fail to retrieve
 the required WAL file after the promotion.

 ISTM that, in order to address this, we might need to log all the
 replication slot activities and replicate them to the standby. I'm not
 sure if this breaks the design of replication slots at all, though.

Yuck.

I believe that the reason why replication slots are not currently
replicated is because we had the idea that the standby could have
slots that don't exist on the master, for cascading replication.  I'm
not sure that works yet, but I think Andres definitely had it in mind
in the original design.

It seems to me that if every machine needs to keep not only the WAL it
requires for itself, but also the WAL that any other machine in the
replication hierarchy might need, that pretty much sucks.  Suppose
you have a master with 10 standbys, and each standby has 10 cascaded
standbys.  If one of those standbys goes down, do we really want all
100 other machines to keep copies of all the WAL?  That seems rather
unfortunate, since it's likely that only a few of those many standbys
are machines to which we would consider failing over.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] After switching primary server while using replication slot.

2014-08-19 Thread Fujii Masao
On Mon, Aug 18, 2014 at 11:16 PM, Sawada Masahiko sawada.m...@gmail.com wrote:
 Hi all,

 After switching the primary server while using replication slots, a
 standby server may not be able to connect to the new primary server.
 Imagine this situation: a primary server has two ASYNC standby
 servers, each using its own replication slot.
 Standby (A) applies WAL without problems, but standby (B) has stopped
 after connecting to the primary server (or its WAL streaming is
 heavily delayed).

 In this situation, standby (B) has not received some WAL segment files
 while it was stopped.
 The primary server cannot remove WAL segments that have not yet been
 received by every standby, so it has to keep those WAL segment files.
 But standby (A) performs its own checkpoints, so it can recycle WAL
 segments.
 As a result, the number of WAL segments differs between the servers
 (standby (A) keeps fewer WAL files than the primary server).
 If the primary server crashes and standby (A) is promoted to primary,
 we can try to connect standby (B) to standby (A) as a new standby.
 But this will fail, because standby (A) might not have the WAL segment
 files that standby (B) requires.

This sounds like a valid concern.

 To resolve this situation, I think the master server should notify all
 standby servers about the removal of WAL segments, and the standby
 servers should recycle WAL segment files based on that information.

 Thoughts?

How does the server recycle WAL files after it's promoted from
standby to master?
Does it do that however it likes? If so, your approach would not be enough.

The approach prevents unexpected removal of WAL files while the standby
is running. But after the standby is promoted to master, it might recycle
needed WAL files immediately. So another standby may still fail to retrieve
the required WAL file after the promotion.

ISTM that, in order to address this, we might need to log all the
replication slot activities and replicate them to the standby. I'm not
sure if this breaks the design of replication slots at all, though.

Regards,

-- 
Fujii Masao




[HACKERS] After switching primary server while using replication slot.

2014-08-18 Thread Sawada Masahiko
Hi all,

After switching the primary server while using replication slots, a
standby server may not be able to connect to the new primary server.
Imagine this situation: a primary server has two ASYNC standby
servers, each using its own replication slot.
Standby (A) applies WAL without problems, but standby (B) has stopped
after connecting to the primary server (or its WAL streaming is
heavily delayed).

In this situation, standby (B) has not received some WAL segment files
while it was stopped.
The primary server cannot remove WAL segments that have not yet been
received by every standby, so it has to keep those WAL segment files.
But standby (A) performs its own checkpoints, so it can recycle WAL
segments.
As a result, the number of WAL segments differs between the servers
(standby (A) keeps fewer WAL files than the primary server).
If the primary server crashes and standby (A) is promoted to primary,
we can try to connect standby (B) to standby (A) as a new standby.
But this will fail, because standby (A) might not have the WAL segment
files that standby (B) requires.

To resolve this situation, I think the master server should notify all
standby servers about the removal of WAL segments, and the standby
servers should recycle WAL segment files based on that information.

Thoughts?

-- 
Regards,

---
Sawada Masahiko

