Re: [HACKERS] Clean switchover

2013-06-25 Thread Fujii Masao
On Mon, Jun 24, 2013 at 3:41 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-06-14 04:56:15 +0900, Fujii Masao wrote:
 On Wed, Jun 12, 2013 at 9:48 PM, Stephen Frost sfr...@snowman.net wrote:
  * Magnus Hagander (mag...@hagander.net) wrote:
  On Wed, Jun 12, 2013 at 1:48 PM, Andres Freund and...@2ndquadrant.com 
  wrote:
   On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
   The attached patch fixes this problem. It just changes walsender so 
   that it
   waits for all the outstanding WAL records to be replicated to the 
   standby
   before closing the replication connection.
  
   Imo this is a fix that needs to get backpatched... The code tried to do
   this but failed, I don't think it really gives grounds for valid *new*
   concerns.
 
  +1 (without having looked at the code itself, it's definitely a
  behaviour that needs to be fixed)
 
  Yea, I was also thinking it would be reasonable to backpatch this; it
  really looks like a bug that we're allowing this to happen today.
 
  So, +1 on a backpatch for me.

 +1. I think that we can backpatch to 9.1, 9.2 and 9.3.

 I marked the patch as ready for committer.

Committed. Thanks a lot!

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean switchover

2013-06-24 Thread Andres Freund
On 2013-06-14 04:56:15 +0900, Fujii Masao wrote:
 On Wed, Jun 12, 2013 at 9:48 PM, Stephen Frost sfr...@snowman.net wrote:
  * Magnus Hagander (mag...@hagander.net) wrote:
  On Wed, Jun 12, 2013 at 1:48 PM, Andres Freund and...@2ndquadrant.com 
  wrote:
   On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
   The attached patch fixes this problem. It just changes walsender so 
   that it
   waits for all the outstanding WAL records to be replicated to the 
   standby
   before closing the replication connection.
  
   Imo this is a fix that needs to get backpatched... The code tried to do
   this but failed, I don't think it really gives grounds for valid *new*
   concerns.
 
  +1 (without having looked at the code itself, it's definitely a
  behaviour that needs to be fixed)
 
  Yea, I was also thinking it would be reasonable to backpatch this; it
  really looks like a bug that we're allowing this to happen today.
 
  So, +1 on a backpatch for me.
 
 +1. I think that we can backpatch to 9.1, 9.2 and 9.3.

I marked the patch as ready for committer.

 In 9.0, the standby doesn't send back any message to the master and
 there is no way to know whether replication has been done up to
 the specified location, so I don't think that we can backpatch.

Agreed. 9.0 doesn't have enough infrastructure for this.

 One note is, even if we backpatch, controlled switchover may require
 the backup in order to follow the timeline switch, in 9.1 and 9.2.
 If we want to avoid the backup in that case, we need to set up
 the shared archive area between the master and the standby and
 set recovery_target_timeline to 'latest'.

Fixing this seems outside the scope of this patch... - and rather
unlikely to be backpatchable.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean switchover

2013-06-13 Thread Fujii Masao
On Wed, Jun 12, 2013 at 12:55 PM, Mark Kirkwood
mark.kirkw...@catalyst.net.nz wrote:
 On 12/06/13 13:15, Stephen Frost wrote:

 * Fujii Masao (masao.fu...@gmail.com) wrote:

 The attached patch fixes this problem. It just changes walsender so that
 it
 waits for all the outstanding WAL records to be replicated to the standby
 before closing the replication connection.


 Seems like a good idea to me..  Rather surprised that we're not doing
 this already, to be honest.


 Yeah +1 from here too. This would make clean switchovers for (typically)
 testing scenarios a lot less complex and resource intensive (rebuilding of
 the old master as a slave when you know it is ok is despairing on a huge
 db).

 On the related note (but not actually to do with this patch),
 clarifying/expanding the docs about the various methods for standby
 promotion:

 1/ trigger file creation
 2/ pg_ctl promote
 3/ renaming/removing recovery.conf

 and the differences between them would be great. For instance I only
 recently realized that method 3) means the promoted standby does not start a
 new timeline (incidentally - could this be an option to pg_ctl promote)
 which is very useful for (again) controlled/clean switchovers.

In 9.3, you no longer need to worry about the increment of timeline
after the promotion because the standby can automatically follow
the timeline switch.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean switchover

2013-06-13 Thread Fujii Masao
On Wed, Jun 12, 2013 at 9:55 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-06-12 08:48:39 -0400, Stephen Frost wrote:
 * Magnus Hagander (mag...@hagander.net) wrote:
  On Wed, Jun 12, 2013 at 1:48 PM, Andres Freund and...@2ndquadrant.com 
  wrote:
   On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
   The attached patch fixes this problem. It just changes walsender so 
   that it
   waits for all the outstanding WAL records to be replicated to the 
   standby
   before closing the replication connection.
  
   Imo this is a fix that needs to get backpatched... The code tried to do
   this but failed, I don't think it really gives grounds for valid *new*
   concerns.
 
  +1 (without having looked at the code itself, it's definitely a
  behaviour that needs to be fixed)

 Yea, I was also thinking it would be reasonable to backpatch this; it
 really looks like a bug that we're allowing this to happen today.

 So, +1 on a backpatch for me.  I've looked at the patch (it's a
 one-liner, plus some additional comments) but havn't looked through the
 overall code surrounding it.

 I've read most of the surrounding code and I think the patch is as
 sensible as it can be without reworking the whole walsender main loop
 which seems like a job for another day.

 I'd personally write
   if (caughtup  !pq_is_send_pending() 
   sentPtr == MyWalSnd-flush)
 as
   if (caughtup  sentPtr == MyWalSnd-flush 
   !pq_is_send_pending())

 Since pq_is_send_pending() basically can only be false if the flush
 comparison is true. There's the tiny chance that we were sending a
 message out just before which is why we should include the
 !pq_is_send_pending() condition at all in that if().

Yep, I updated the patch that way. Thanks for the comment!

Regards,

-- 
Fujii Masao


switchover_v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean switchover

2013-06-13 Thread Fujii Masao
On Wed, Jun 12, 2013 at 9:48 PM, Stephen Frost sfr...@snowman.net wrote:
 * Magnus Hagander (mag...@hagander.net) wrote:
 On Wed, Jun 12, 2013 at 1:48 PM, Andres Freund and...@2ndquadrant.com 
 wrote:
  On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
  The attached patch fixes this problem. It just changes walsender so that 
  it
  waits for all the outstanding WAL records to be replicated to the standby
  before closing the replication connection.
 
  Imo this is a fix that needs to get backpatched... The code tried to do
  this but failed, I don't think it really gives grounds for valid *new*
  concerns.

 +1 (without having looked at the code itself, it's definitely a
 behaviour that needs to be fixed)

 Yea, I was also thinking it would be reasonable to backpatch this; it
 really looks like a bug that we're allowing this to happen today.

 So, +1 on a backpatch for me.

+1. I think that we can backpatch to 9.1, 9.2 and 9.3.

In 9.0, the standby doesn't send back any message to the master and
there is no way to know whether replication has been done up to
the specified location, so I don't think that we can backpatch.

One note is, even if we backpatch, controlled switchover may require
the backup in order to follow the timeline switch, in 9.1 and 9.2.
If we want to avoid the backup in that case, we need to set up
the shared archive area between the master and the standby and
set recovery_target_timeline to 'latest'.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean switchover

2013-06-13 Thread Mark Kirkwood

On 14/06/13 07:38, Fujii Masao wrote:

On Wed, Jun 12, 2013 at 12:55 PM, Mark Kirkwood
mark.kirkw...@catalyst.net.nz wrote:

On 12/06/13 13:15, Stephen Frost wrote:


* Fujii Masao (masao.fu...@gmail.com) wrote:


The attached patch fixes this problem. It just changes walsender so that
it
waits for all the outstanding WAL records to be replicated to the standby
before closing the replication connection.



Seems like a good idea to me..  Rather surprised that we're not doing
this already, to be honest.



Yeah +1 from here too. This would make clean switchovers for (typically)
testing scenarios a lot less complex and resource intensive (rebuilding of
the old master as a slave when you know it is ok is despairing on a huge
db).

On the related note (but not actually to do with this patch),
clarifying/expanding the docs about the various methods for standby
promotion:

1/ trigger file creation
2/ pg_ctl promote
3/ renaming/removing recovery.conf

and the differences between them would be great. For instance I only
recently realized that method 3) means the promoted standby does not start a
new timeline (incidentally - could this be an option to pg_ctl promote)
which is very useful for (again) controlled/clean switchovers.


In 9.3, you no longer need to worry about the increment of timeline
after the promotion because the standby can automatically follow
the timeline switch.

Regards,



Yes - and that will be awesome. Nice work!

However for those systems still on 9.1/9.2for the time being, some 
clarification of the details/differences uisg promotion via the 1) - 3) 
would be useful I think.


Note I'm not demanding that you do it - I just happened to be thinking 
about this when I saw your patch, so though I'd mention it (I've seen 
some brief discussion about this before, but nothing concrete came out 
of that in terms of documentation additions).


If folks are keen, I'm willing to attempt to find a suitable place in 
the docs to add a discussion about 1) - 3) above. However I was kinda 
hoping that someone who knows all the details would... fro instance I've 
skimmed the code and note that the system knows when it is promoted 
via trigger file vs pg_ctl... but as to what if any differences that 
makes to the resulting actions (other than removing the trigger file) - 
I am not clear on.


Cheers

Mark


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean switchover

2013-06-12 Thread Magnus Hagander
On Wed, Jun 12, 2013 at 6:41 AM, Amit Kapila amit.kap...@huawei.com wrote:
 On Wednesday, June 12, 2013 4:23 AM Fujii Masao wrote:
 Hi,

 In streaming replication, when we shutdown the master, walsender tries
 to send all the outstanding WAL records including the shutdown
 checkpoint record to the standby, and then to exit. This basically
 means that all the WAL records are fully synced between two servers
 after the clean shutdown of the master. So, after promoting the standby
 to new master, we can restart the stopped master as new standby without
 the need for a fresh backup from new master.

 But there is one problem: though walsender tries to send all the
 outstanding WAL records, it doesn't wait for them to be replicated to
 the standby. IOW, walsender closes the replication connection as soon
 as it sends WAL records.
 Then, before receiving all the WAL records, walreceiver can detect the
 closure of connection and exit. We cannot guarantee that there is no
 missing WAL in the standby after clean shutdown of the master. In this
 case, backup from new master is required when restarting the stopped
 master as new standby. I have experienced this case several times,
 especially when enabling WAL archiving.

 The attached patch fixes this problem. It just changes walsender so
 that it waits for all the outstanding WAL records to be replicated to
 the standby before closing the replication connection.

 You may be concerned the case where the standby gets stuck and the
 walsender keeps waiting for the reply from that standby. In this case,
 wal_sender_timeout detects such inactive standby and then walsender
 ends. So even in that case, the shutdown can end.

 Do you think it can impact time to complete shutdown?
 After completing shutdown, user will promote standby to master, so if there
 is delay in shutdown, it can cause delay in switchover.

I'd expect a controlled switchover to happen without dataloss. Yes,
this could make it take a bit longer time, but it guarantees you don't
loose data. ISTM that if you don't care about the potential dataloss,
you can just use a faster shutdown method (e.g. immediate)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean switchover

2013-06-12 Thread Andres Freund
Hi,

On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
 The attached patch fixes this problem. It just changes walsender so that it
 waits for all the outstanding WAL records to be replicated to the standby
 before closing the replication connection.

Imo this is a fix that needs to get backpatched... The code tried to do
this but failed, I don't think it really gives grounds for valid *new*
concerns.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean switchover

2013-06-12 Thread Magnus Hagander
On Wed, Jun 12, 2013 at 1:48 PM, Andres Freund and...@2ndquadrant.com wrote:
 Hi,

 On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
 The attached patch fixes this problem. It just changes walsender so that it
 waits for all the outstanding WAL records to be replicated to the standby
 before closing the replication connection.

 Imo this is a fix that needs to get backpatched... The code tried to do
 this but failed, I don't think it really gives grounds for valid *new*
 concerns.

+1 (without having looked at the code itself, it's definitely a
behaviour that needs to be fixed)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean switchover

2013-06-12 Thread Stephen Frost
* Magnus Hagander (mag...@hagander.net) wrote:
 On Wed, Jun 12, 2013 at 1:48 PM, Andres Freund and...@2ndquadrant.com wrote:
  On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
  The attached patch fixes this problem. It just changes walsender so that it
  waits for all the outstanding WAL records to be replicated to the standby
  before closing the replication connection.
 
  Imo this is a fix that needs to get backpatched... The code tried to do
  this but failed, I don't think it really gives grounds for valid *new*
  concerns.
 
 +1 (without having looked at the code itself, it's definitely a
 behaviour that needs to be fixed)

Yea, I was also thinking it would be reasonable to backpatch this; it
really looks like a bug that we're allowing this to happen today.

So, +1 on a backpatch for me.  I've looked at the patch (it's a
one-liner, plus some additional comments) but havn't looked through the
overall code surrounding it.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Clean switchover

2013-06-12 Thread Andres Freund
On 2013-06-12 08:48:39 -0400, Stephen Frost wrote:
 * Magnus Hagander (mag...@hagander.net) wrote:
  On Wed, Jun 12, 2013 at 1:48 PM, Andres Freund and...@2ndquadrant.com 
  wrote:
   On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
   The attached patch fixes this problem. It just changes walsender so that 
   it
   waits for all the outstanding WAL records to be replicated to the standby
   before closing the replication connection.
  
   Imo this is a fix that needs to get backpatched... The code tried to do
   this but failed, I don't think it really gives grounds for valid *new*
   concerns.
  
  +1 (without having looked at the code itself, it's definitely a
  behaviour that needs to be fixed)
 
 Yea, I was also thinking it would be reasonable to backpatch this; it
 really looks like a bug that we're allowing this to happen today.
 
 So, +1 on a backpatch for me.  I've looked at the patch (it's a
 one-liner, plus some additional comments) but havn't looked through the
 overall code surrounding it.

I've read most of the surrounding code and I think the patch is as
sensible as it can be without reworking the whole walsender main loop
which seems like a job for another day.

I'd personally write
  if (caughtup  !pq_is_send_pending() 
  sentPtr == MyWalSnd-flush)
as
  if (caughtup  sentPtr == MyWalSnd-flush 
  !pq_is_send_pending())

Since pq_is_send_pending() basically can only be false if the flush
comparison is true. There's the tiny chance that we were sending a
message out just before which is why we should include the
!pq_is_send_pending() condition at all in that if().

But that's fairly, fairly minor.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean switchover

2013-06-11 Thread Stephen Frost
* Fujii Masao (masao.fu...@gmail.com) wrote:
 The attached patch fixes this problem. It just changes walsender so that it
 waits for all the outstanding WAL records to be replicated to the standby
 before closing the replication connection.

Seems like a good idea to me..  Rather surprised that we're not doing
this already, to be honest.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Clean switchover

2013-06-11 Thread Mark Kirkwood

On 12/06/13 13:15, Stephen Frost wrote:

* Fujii Masao (masao.fu...@gmail.com) wrote:

The attached patch fixes this problem. It just changes walsender so that it
waits for all the outstanding WAL records to be replicated to the standby
before closing the replication connection.


Seems like a good idea to me..  Rather surprised that we're not doing
this already, to be honest.



Yeah +1 from here too. This would make clean switchovers for (typically) 
testing scenarios a lot less complex and resource intensive (rebuilding 
of the old master as a slave when you know it is ok is despairing on a 
huge db).


On the related note (but not actually to do with this patch), 
clarifying/expanding the docs about the various methods for standby 
promotion:


1/ trigger file creation
2/ pg_ctl promote
3/ renaming/removing recovery.conf

and the differences between them would be great. For instance I only 
recently realized that method 3) means the promoted standby does not 
start a new timeline (incidentally - could this be an option to pg_ctl 
promote) which is very useful for (again) controlled/clean switchovers.


regards

Mark




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean switchover

2013-06-11 Thread Amit Kapila
On Wednesday, June 12, 2013 4:23 AM Fujii Masao wrote:
 Hi,
 
 In streaming replication, when we shutdown the master, walsender tries
 to send all the outstanding WAL records including the shutdown
 checkpoint record to the standby, and then to exit. This basically
 means that all the WAL records are fully synced between two servers
 after the clean shutdown of the master. So, after promoting the standby
 to new master, we can restart the stopped master as new standby without
 the need for a fresh backup from new master.
 
 But there is one problem: though walsender tries to send all the
 outstanding WAL records, it doesn't wait for them to be replicated to
 the standby. IOW, walsender closes the replication connection as soon
 as it sends WAL records.
 Then, before receiving all the WAL records, walreceiver can detect the
 closure of connection and exit. We cannot guarantee that there is no
 missing WAL in the standby after clean shutdown of the master. In this
 case, backup from new master is required when restarting the stopped
 master as new standby. I have experienced this case several times,
 especially when enabling WAL archiving.
 
 The attached patch fixes this problem. It just changes walsender so
 that it waits for all the outstanding WAL records to be replicated to
 the standby before closing the replication connection.

 You may be concerned the case where the standby gets stuck and the
 walsender keeps waiting for the reply from that standby. In this case,
 wal_sender_timeout detects such inactive standby and then walsender
 ends. So even in that case, the shutdown can end.

Do you think it can impact time to complete shutdown?
After completing shutdown, user will promote standby to master, so if there
is delay in shutdown, it can cause delay in switchover.


With Regards,
Amit Kapila.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers