Re: [HACKERS] Time-Delayed Standbys

2013-12-14 Thread Andres Freund
On 2013-12-13 13:44:30 +, Simon Riggs wrote:
 On 13 December 2013 13:22, Andres Freund and...@2ndquadrant.com wrote:
  On 2013-12-13 13:09:13 +, Simon Riggs wrote:
  On 13 December 2013 11:58, Andres Freund and...@2ndquadrant.com wrote:
   I removed it because it was after the pause. I'll replace it, but
   before the pause.
  
   Doesn't after the pause make more sense? If somebody promoted while we
   were waiting, we want to recognize that before rolling forward? The wait
   can take a long while after all?
 
  That would change the way pause currently works, which is OOS for that 
  patch.
 
  But this feature isn't pause itself - it's imo something
  independent. Note that we currently
  a) check pause again after recoveryApplyDelay(),
  b) do check for promotion if the sleep in recoveryApplyDelay() is
 interrupted. So not checking after the final sleep seems confusing.
 
 I'm proposing the attached patch.

Looks good, although I'd move it down below the comment ;)

 This patch implements a consistent view of recovery pause, which is
 that when paused, we don't check for promotion, during or immediately
 after. That is user noticeable behaviour and shouldn't be changed
 without thought and discussion on a separate thread with a clear
 descriptive title. (I might argue in favour of it myself, I'm not yet
 decided).

Some more improvements in that area would certainly be good...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Time-Delayed Standbys

2013-12-13 Thread Simon Riggs
On 12 December 2013 21:58, Fabrízio de Royes Mello
fabriziome...@gmail.com wrote:

 On Thu, Dec 12, 2013 at 3:42 PM, Fabrízio de Royes Mello
 fabriziome...@gmail.com wrote:

 On Thu, Dec 12, 2013 at 3:39 PM, Simon Riggs si...@2ndquadrant.com
 wrote:
 
  On 12 December 2013 15:19, Simon Riggs si...@2ndquadrant.com wrote:
 
   Don't panic guys! I meant UTC offset only. And yes, it may not be
   needed, will check.
 
  Checked, all non-UTC TZ offsets work without further effort here.
 

 Thanks!


 Reviewing the committed patch I noted that the CheckForStandbyTrigger()
 after the delay was removed.

 If we promote the standby during the delay and don't check the trigger
 immediately after the delay, then we will replay undesired WAL records.

 The attached patch adds this check.

I removed it because it was after the pause. I'll replace it, but
before the pause.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-13 Thread Andres Freund
On 2013-12-13 11:56:47 +, Simon Riggs wrote:
 On 12 December 2013 21:58, Fabrízio de Royes Mello
 fabriziome...@gmail.com wrote:
  Reviewing the committed patch I noted that the CheckForStandbyTrigger()
  after the delay was removed.
 
  If we promote the standby during the delay and don't check the trigger
  immediately after the delay, then we will replay undesired WAL records.

  The attached patch adds this check.
 
 I removed it because it was after the pause. I'll replace it, but
 before the pause.

Doesn't after the pause make more sense? If somebody promoted while we
were waiting, we want to recognize that before rolling forward? The wait
can take a long while after all?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-13 Thread Simon Riggs
On 13 December 2013 11:58, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-12-13 11:56:47 +, Simon Riggs wrote:
 On 12 December 2013 21:58, Fabrízio de Royes Mello
 fabriziome...@gmail.com wrote:
  Reviewing the committed patch I noted that the CheckForStandbyTrigger()
  after the delay was removed.
 
  If we promote the standby during the delay and don't check the trigger
  immediately after the delay, then we will replay undesired WAL records.

  The attached patch adds this check.

 I removed it because it was after the pause. I'll replace it, but
 before the pause.

 Doesn't after the pause make more sense? If somebody promoted while we
 were waiting, we want to recognize that before rolling forward? The wait
 can take a long while after all?

That would change the way pause currently works, which is OOS for that patch.

I'm happy to discuss such a change, but if agreed, it would need to
apply in all cases, not just this one.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-13 Thread Andres Freund
On 2013-12-13 13:09:13 +, Simon Riggs wrote:
 On 13 December 2013 11:58, Andres Freund and...@2ndquadrant.com wrote:
  On 2013-12-13 11:56:47 +, Simon Riggs wrote:
  On 12 December 2013 21:58, Fabrízio de Royes Mello
  fabriziome...@gmail.com wrote:
   Reviewing the committed patch I noted that the CheckForStandbyTrigger()
   after the delay was removed.
  
   If we promote the standby during the delay and don't check the trigger
   immediately after the delay, then we will replay undesired WAL records.

   The attached patch adds this check.
 
  I removed it because it was after the pause. I'll replace it, but
  before the pause.
 
  Doesn't after the pause make more sense? If somebody promoted while we
  were waiting, we want to recognize that before rolling forward? The wait
  can take a long while after all?
 
 That would change the way pause currently works, which is OOS for that patch.

But this feature isn't pause itself - it's imo something
independent. Note that we currently
a) check pause again after recoveryApplyDelay(),
b) do check for promotion if the sleep in recoveryApplyDelay() is
   interrupted. So not checking after the final sleep seems confusing.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-13 Thread Simon Riggs
On 13 December 2013 13:22, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-12-13 13:09:13 +, Simon Riggs wrote:
 On 13 December 2013 11:58, Andres Freund and...@2ndquadrant.com wrote:
  On 2013-12-13 11:56:47 +, Simon Riggs wrote:
  On 12 December 2013 21:58, Fabrízio de Royes Mello
  fabriziome...@gmail.com wrote:
   Reviewing the committed patch I noted that the 
   CheckForStandbyTrigger()
   after the delay was removed.
  
   If we promote the standby during the delay and don't check the trigger
   immediately after the delay, then we will replay undesired WAL records.

   The attached patch adds this check.
 
  I removed it because it was after the pause. I'll replace it, but
  before the pause.
 
  Doesn't after the pause make more sense? If somebody promoted while we
  were waiting, we want to recognize that before rolling forward? The wait
  can take a long while after all?

 That would change the way pause currently works, which is OOS for that patch.

 But this feature isn't pause itself - it's imo something
 independent. Note that we currently
 a) check pause again after recoveryApplyDelay(),
 b) do check for promotion if the sleep in recoveryApplyDelay() is
interrupted. So not checking after the final sleep seems confusing.

I'm proposing the attached patch.

This patch implements a consistent view of recovery pause, which is
that when paused, we don't check for promotion, during or immediately
after. That is user noticeable behaviour and shouldn't be changed
without thought and discussion on a separate thread with a clear
descriptive title. (I might argue in favour of it myself, I'm not yet
decided).

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


snippet.patch
Description: Binary data



Re: [HACKERS] Time-Delayed Standbys

2013-12-13 Thread Fabrízio de Royes Mello
On Fri, Dec 13, 2013 at 11:44 AM, Simon Riggs si...@2ndquadrant.com wrote:

 On 13 December 2013 13:22, Andres Freund and...@2ndquadrant.com wrote:
  On 2013-12-13 13:09:13 +, Simon Riggs wrote:
  On 13 December 2013 11:58, Andres Freund and...@2ndquadrant.com
wrote:
   On 2013-12-13 11:56:47 +, Simon Riggs wrote:
   On 12 December 2013 21:58, Fabrízio de Royes Mello
   fabriziome...@gmail.com wrote:
Reviewing the committed patch I noted that the
CheckForStandbyTrigger()
after the delay was removed.
   
If we promote the standby during the delay and don't check the trigger
immediately after the delay, then we will replay undesired WAL records.

The attached patch adds this check.
  
   I removed it because it was after the pause. I'll replace it, but
   before the pause.
  
   Doesn't after the pause make more sense? If somebody promoted while
we
   were waiting, we want to recognize that before rolling forward? The
wait
   can take a long while after all?
 
  That would change the way pause currently works, which is OOS for that
patch.
 
  But this feature isn't pause itself - it's imo something
  independent. Note that we currently
  a) check pause again after recoveryApplyDelay(),
  b) do check for promotion if the sleep in recoveryApplyDelay() is
 interrupted. So not checking after the final sleep seems confusing.

 I'm proposing the attached patch.

 This patch implements a consistent view of recovery pause, which is
 that when paused, we don't check for promotion, during or immediately
 after. That is user noticeable behaviour and shouldn't be changed
 without thought and discussion on a separate thread with a clear
 descriptive title. (I might argue in favour of it myself, I'm not yet
 decided).


In my previous message [1] I attached a patch identical to yours ;-)

Regards,

[1]
http://www.postgresql.org/message-id/CAFcNs+qD0AJ=qzhsHD9+v_Mhz0RTBJ=cJPCT_T=ut_jvvnc...@mail.gmail.com

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Time-Delayed Standbys

2013-12-12 Thread KONDO Mitsumasa

(2013/12/12 7:23), Fabrízio de Royes Mello wrote:

On Wed, Dec 11, 2013 at 7:47 PM, Andres Freund and...@2ndquadrant.com
  * hot_standby=off: Makes delay useable with wal_level=archive (and thus
a lower WAL volume)
  * standby_mode=off: Configurations that use tools like pg_standby and
similar simply don't need standby_mode=on. If you want to trigger
failover from within the restore_command you *cannot* set it.
  * recovery_target_*: It can still make sense if you use
pause_at_recovery_target.


I don't think some of his arguments are quite right... We can just set
standby_mode=on when we use min_standby_apply_delay with pg_standby and similar
simple tools. However, I tend to agree with not needing to prohibit except for
standby_mode. So I'd like to propose changing the parameter name from
min_standby_apply_delay to min_recovery_apply_delay. That name is more natural
for this feature.


Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center




Re: [HACKERS] Time-Delayed Standbys

2013-12-12 Thread Simon Riggs
On 12 December 2013 08:19, KONDO Mitsumasa
kondo.mitsum...@lab.ntt.co.jp wrote:
 (2013/12/12 7:23), Fabrízio de Royes Mello wrote:

 On Wed, Dec 11, 2013 at 7:47 PM, Andres Freund and...@2ndquadrant.com
   * hot_standby=off: Makes delay useable with wal_level=archive (and thus
 a lower WAL volume)
   * standby_mode=off: Configurations that use tools like pg_standby and
 similar simply don't need standby_mode=on. If you want to trigger
 failover from within the restore_command you *cannot* set it.
   * recovery_target_*: It can still make sense if you use
 pause_at_recovery_target.


 I don't think some of his arguments are quite right... We can just set
 standby_mode=on when we use min_standby_apply_delay with pg_standby and
 similar simple tools. However, I tend to agree with not needing to prohibit
 except for standby_mode. So I'd like to propose changing the parameter name
 from min_standby_apply_delay to min_recovery_apply_delay. That name is more
 natural for this feature.

OK

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-12 Thread Simon Riggs
On 9 December 2013 10:54, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote:
 (2013/12/09 19:35), Pavel Stehule wrote:




 2013/12/9 KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp


 Hi Fabrízio,

 I tested your v4 patch; here are my review comments.

 * Fix typo
   49 -# commited transactions from the master, specify a recovery
 time delay.
   49 +# committed transactions from the master, specify a recovery
 time delay.

 * Fix white space
   177 -   if (secs <= 0 && microsecs <=0)
   177 +   if (secs <= 0 && microsecs <=0 )

 * Add functionality (I propose)
 We can set a negative number for min_standby_apply_delay. I think that this
 feature is for worldwide replication situations. For example, the master
 server is in Japan and the slave server is in San Francisco. Japan time is
 ahead of San Francisco time. If we want a delay in this situation, a
 negative number may be needed in min_standby_apply_delay. So I propose
 changing the time delay conditional branch as follows.
   - if (min_standby_apply_delay > 0)
   + if (min_standby_apply_delay != 0)
 What do you think? It might also work correctly.


 what about using an interval instead of absolute time?

 This is because local time is recorded in XLOG. And it has big cost for
 calculating global time.

I agree with your request here, but I don't think negative values are
the right way to implement that, at least it would not be very usable.

My suggestion would be to add the TZ to the checkpoint record. This
way all users of WAL can see the TZ of the master and act accordingly.
I'll do a separate patch for that.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-12 Thread KONDO Mitsumasa

(2013/12/12 18:09), Simon Riggs wrote:

On 9 December 2013 10:54, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp wrote:

(2013/12/09 19:35), Pavel Stehule wrote:





2013/12/9 KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp


 Hi Fabrízio,

 I tested your v4 patch; here are my review comments.

 * Fix typo
   49 -# commited transactions from the master, specify a recovery
time delay.
   49 +# committed transactions from the master, specify a recovery
time delay.

 * Fix white space
   177 -   if (secs <= 0 && microsecs <=0)
   177 +   if (secs <= 0 && microsecs <=0 )

 * Add functionality (I propose)
 We can set a negative number for min_standby_apply_delay. I think that this
 feature is for worldwide replication situations. For example, the master
 server is in Japan and the slave server is in San Francisco. Japan time is
 ahead of San Francisco time. If we want a delay in this situation, a
 negative number may be needed in min_standby_apply_delay. So I propose
 changing the time delay conditional branch as follows.
   - if (min_standby_apply_delay > 0)
   + if (min_standby_apply_delay != 0)
 What do you think? It might also work correctly.


what about using an interval instead of absolute time?


This is because local time is recorded in XLOG. And it has big cost for
calculating global time.


I agree with your request here, but I don't think negative values are
the right way to implement that, at least it would not be very usable.
I think that my proposal is the easiest and simplest way to solve this problem.
And I believe that anyone who cannot calculate the time-zone difference
would not set up a replication cluster across continents.



My suggestion would be to add the TZ to the checkpoint record. This
way all users of WAL can see the TZ of the master and act accordingly.
I'll do a separate patch for that.

That would also be useful in other situations. However, it is likely to lead
to long and complicated discussions... I think our hope is to commit this patch
in this commit-fest or the next, final commit-fest.


Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center




Re: [HACKERS] Time-Delayed Standbys

2013-12-12 Thread Simon Riggs
On 12 December 2013 10:42, KONDO Mitsumasa
kondo.mitsum...@lab.ntt.co.jp wrote:

 I agree with your request here, but I don't think negative values are
 the right way to implement that, at least it would not be very usable.

 I think that my proposal is the easiest and simplest way to solve this
 problem. And I believe that anyone who cannot calculate the time-zone
 difference would not set up a replication cluster across continents.


 My suggestion would be to add the TZ to the checkpoint record. This
 way all users of WAL can see the TZ of the master and act accordingly.
 I'll do a separate patch for that.

 That would also be useful in other situations. However, it is likely to
 lead to long and complicated discussions... I think our hope is to commit
 this patch in this commit-fest or the next, final commit-fest.

Agreed on no delay for the delay patch, as shown by my commit.

Still think we need better TZ handling.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-12 Thread Andres Freund
On 2013-12-12 09:09:21 +, Simon Riggs wrote:
  * Add functionality (I propose)
  We can set a negative number for min_standby_apply_delay. I think that this
  feature is for worldwide replication situations. For example, the master
  server is in Japan and the slave server is in San Francisco. Japan time is
  ahead of San Francisco time. If we want a delay in this situation, it may
  need a negative

  This is because local time is recorded in XLOG. And it has big cost for
  calculating global time.

Uhm? Isn't the timestamp in commit records actually a TimestampTz? And
thus essentially stored as UTC? I don't think this problem actually
exists?
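
To illustrate the point with a rough sketch (this is not PostgreSQL source; the type alias utc_usecs and the function remaining_delay_usecs are invented for this example, and it assumes timestamps are integer microseconds since a common epoch in UTC): once both the commit timestamp and the standby's clock are in that form, the remaining wait is a plain subtraction and no time zone enters the calculation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a UTC timestamp: microseconds since an epoch. */
typedef int64_t utc_usecs;

/*
 * Return how many microseconds are left before a record committed at
 * commit_time may be applied, given a configured non-negative delay.
 * Returns 0 once the delay has already elapsed.
 */
int64_t
remaining_delay_usecs(utc_usecs commit_time, utc_usecs now, int64_t delay_usecs)
{
	utc_usecs	apply_at;

	/* A negative delay is treated as a caller error in this sketch. */
	assert(delay_usecs >= 0);

	apply_at = commit_time + delay_usecs;
	return (apply_at > now) ? (int64_t) (apply_at - now) : 0;
}
```

Because both inputs are absolute UTC instants, a master in Japan and a standby in San Francisco would compute the same result as two servers in the same room, provided their clocks are synchronized.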

 My suggestion would be to add the TZ to the checkpoint record. This
 way all users of WAL can see the TZ of the master and act accordingly.
 I'll do a separate patch for that.

Intuitively I'd say that might be useful - but I am not really sure what
for. And we don't exactly have a great interface for looking at a
checkpoint's data. Maybe add it to the control file instead?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-12 Thread Simon Riggs
On 12 December 2013 11:05, Andres Freund and...@2ndquadrant.com wrote:

 My suggestion would be to add the TZ to the checkpoint record. This
 way all users of WAL can see the TZ of the master and act accordingly.
 I'll do a separate patch for that.

 Intuitively I'd say that might be useful - but I am not really sure what
 for. And we don't exactly have a great interface for looking at a
 checkpoint's data. Maybe add it to the control file instead?

That's actually what I had in mind, I just phrased it badly in mid-thought.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-12 Thread Mitsumasa KONDO
2013/12/12 Simon Riggs si...@2ndquadrant.com

 On 12 December 2013 10:42, KONDO Mitsumasa
 kondo.mitsum...@lab.ntt.co.jp wrote:

  I agree with your request here, but I don't think negative values are
  the right way to implement that, at least it would not be very usable.
 
  I think that my proposal is the easiest and simplest way to solve this
  problem. And I believe that anyone who cannot calculate the time-zone
  difference would not set up a replication cluster across continents.
 
 
  My suggestion would be to add the TZ to the checkpoint record. This
  way all users of WAL can see the TZ of the master and act accordingly.
  I'll do a separate patch for that.
 
  That would also be useful in other situations. However, it is likely to
  lead to long and complicated discussions... I think our hope is to commit
  this patch in this commit-fest or the next, final commit-fest.

 Agreed on no delay for the delay patch, as shown by my commit.

Our forecast was very accurate...
Nice commit, Thanks!

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center


Re: [HACKERS] Time-Delayed Standbys

2013-12-12 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On 12 December 2013 11:05, Andres Freund and...@2ndquadrant.com wrote:
 My suggestion would be to add the TZ to the checkpoint record. This
 way all users of WAL can see the TZ of the master and act accordingly.
 I'll do a separate patch for that.

 Intuitively I'd say that might be useful - but I am not really sure what
 for. And we don't exactly have a great interface for looking at a
 checkpoint's data. Maybe add it to the control file instead?

 That's actually what I had in mind, I just phrased it badly in mid-thought.

I don't think you realize what a can of worms that would be.  There's
no compact representation of a timezone, unless you are only proposing
to store the UTC offset; and frankly I'm not particularly seeing the point
of that.

regards, tom lane




Re: [HACKERS] Time-Delayed Standbys

2013-12-12 Thread Robert Haas
On Thu, Dec 12, 2013 at 9:52 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 On 12 December 2013 11:05, Andres Freund and...@2ndquadrant.com wrote:
 My suggestion would be to add the TZ to the checkpoint record. This
 way all users of WAL can see the TZ of the master and act accordingly.
 I'll do a separate patch for that.

 Intuitively I'd say that might be useful - but I am not really sure what
 for. And we don't exactly have a great interface for looking at a
 checkpoint's data. Maybe add it to the control file instead?

 That's actually what I had in mind, I just phrased it badly in mid-thought.

 I don't think you realize what a can of worms that would be.  There's
 no compact representation of a timezone, unless you are only proposing
 to store the UTC offset; and frankly I'm not particularly seeing the point
 of that.

+1.  I can see the point of storing a timestamp in each checkpoint
record, if we don't already, but time zones should be completely
irrelevant to this feature.  Everything should be reckoned in seconds
since the epoch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Time-Delayed Standbys

2013-12-12 Thread Simon Riggs
On 12 December 2013 15:03, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Dec 12, 2013 at 9:52 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 On 12 December 2013 11:05, Andres Freund and...@2ndquadrant.com wrote:
 My suggestion would be to add the TZ to the checkpoint record. This
 way all users of WAL can see the TZ of the master and act accordingly.
 I'll do a separate patch for that.

 Intuitively I'd say that might be useful - but I am not really sure what
 for. And we don't exactly have a great interface for looking at a
 checkpoint's data. Maybe add it to the control file instead?

 That's actually what I had in mind, I just phrased it badly in mid-thought.

 I don't think you realize what a can of worms that would be.  There's
 no compact representation of a timezone, unless you are only proposing
 to store the UTC offset; and frankly I'm not particularly seeing the point
 of that.

 +1.  I can see the point of storing a timestamp in each checkpoint
 record, if we don't already, but time zones should be completely
 irrelevant to this feature.  Everything should be reckoned in seconds
 since the epoch.

Don't panic guys! I meant UTC offset only. And yes, it may not be
needed, will check.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-12 Thread Simon Riggs
On 12 December 2013 15:19, Simon Riggs si...@2ndquadrant.com wrote:

 Don't panic guys! I meant UTC offset only. And yes, it may not be
 needed, will check.

Checked, all non-UTC TZ offsets work without further effort here.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-12 Thread Fabrízio de Royes Mello
On Thu, Dec 12, 2013 at 3:39 PM, Simon Riggs si...@2ndquadrant.com wrote:

 On 12 December 2013 15:19, Simon Riggs si...@2ndquadrant.com wrote:

  Don't panic guys! I meant UTC offset only. And yes, it may not be
  needed, will check.

 Checked, all non-UTC TZ offsets work without further effort here.


Thanks!

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Time-Delayed Standbys

2013-12-12 Thread Fabrízio de Royes Mello
On Thu, Dec 12, 2013 at 3:42 PM, Fabrízio de Royes Mello 
fabriziome...@gmail.com wrote:

 On Thu, Dec 12, 2013 at 3:39 PM, Simon Riggs si...@2ndquadrant.com
wrote:
 
  On 12 December 2013 15:19, Simon Riggs si...@2ndquadrant.com wrote:
 
   Don't panic guys! I meant UTC offset only. And yes, it may not be
   needed, will check.
 
  Checked, all non-UTC TZ offsets work without further effort here.
 

 Thanks!


Reviewing the committed patch I noted that the CheckForStandbyTrigger()
after the delay was removed.

If we promote the standby during the delay and don't check the trigger
immediately after the delay, then we will replay undesired WAL records.

The attached patch adds this check.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index a76aef3..fbc2d2f 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6835,6 +6835,14 @@ StartupXLOG(void)
 	recoveryApplyDelay();
 
 	/*
+	 * Check for standby trigger to prevent the
+	 * replay of undesired WAL records if the
+	 * slave was promoted during the delay.
+	 */
+	if (CheckForStandbyTrigger())
+		break;
+
+	/*
 	 * We test for paused recovery again here. If
 	 * user sets delayed apply, it may be because
 	 * they expect to pause recovery in case of
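
The effect of the added check can be illustrated with a small standalone simulation. This is only a sketch under stated assumptions: the stub functions below merely stand in for recoveryApplyDelay() and CheckForStandbyTrigger(), the function replay_records and its parameters are hypothetical, and the assumption that a promotion request is otherwise noticed only before the next record is considered is made purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in state for the example; not PostgreSQL's real data structures. */
static bool promote_requested = false;

/* Stub delay: pretend promotion is requested while sleeping before record promote_at. */
static void
apply_delay_stub(int recno, int promote_at)
{
	if (recno == promote_at)
		promote_requested = true;
}

/* Stub for the trigger check discussed in the thread. */
static bool
check_for_standby_trigger_stub(void)
{
	return promote_requested;
}

/*
 * Simulate replaying nrecords records and return how many were applied.
 * If check_after_delay is true, the trigger is checked right after the
 * delay, so a promotion requested during the sleep stops replay before
 * the delayed record is applied.
 */
int
replay_records(int nrecords, int promote_at, bool check_after_delay)
{
	int			applied = 0;

	assert(nrecords >= 0);
	promote_requested = false;

	for (int recno = 0; recno < nrecords; recno++)
	{
		/* Illustrative assumption: promotion is noticed before each record. */
		if (check_for_standby_trigger_stub())
			break;

		apply_delay_stub(recno, promote_at);

		/* The check proposed in the patch above. */
		if (check_after_delay && check_for_standby_trigger_stub())
			break;

		applied++;				/* "apply" the record */
	}
	return applied;
}
```

With promotion requested during the delay before record 2, the variant with the check applies two records, while the variant without it applies a third, unwanted one before stopping.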



Re: [HACKERS] Time-Delayed Standbys

2013-12-11 Thread Simon Riggs
On 11 December 2013 06:36, KONDO Mitsumasa
kondo.mitsum...@lab.ntt.co.jp wrote:

 I think this feature will be used in a lot of scenarios in
 which PITR is currently used.

 We have to judge which is better: gaining some potential use cases, or
 protecting users from foolish settings.
 And we had better wait for the author's comment...

I'd say just document that it wouldn't make sense to use it for PITR.

There may be some use case we can't see yet, so specifically
prohibiting a use case that is not dangerous seems too much at this
point. I will no doubt be reminded of these words in the future...

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-11 Thread Fabrízio de Royes Mello
On Wed, Dec 11, 2013 at 6:27 AM, Simon Riggs si...@2ndquadrant.com wrote:

 On 11 December 2013 06:36, KONDO Mitsumasa
 kondo.mitsum...@lab.ntt.co.jp wrote:

  I think this feature will be used in a lot of scenarios in
  which PITR is currently used.
 
  We have to judge which is better: gaining some potential use cases, or
  protecting users from foolish settings.
  And we had better wait for the author's comment...

 I'd say just document that it wouldn't make sense to use it for PITR.

 There may be some use case we can't see yet, so specifically
 prohibiting a use case that is not dangerous seems too much at this
 point. I will no doubt be reminded of these words in the future...


Hi all,

I tend to agree with Simon, but I confess that I don't like delaying a
server with standby_mode = 'off'.

The main goal of this patch is to delay streaming replication, so if the
slave server isn't a hot standby, I think it makes no sense to delay it.

Mitsumasa suggested adding StandbyModeRequested to the conditional branch to
skip this situation. I agree with him!

And I'll change 'recoveryDelay' (functions, variables) to 'standbyDelay'.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Time-Delayed Standbys

2013-12-11 Thread Andres Freund
On 2013-12-11 16:37:54 -0200, Fabrízio de Royes Mello wrote:
 On Wed, Dec 11, 2013 at 6:27 AM, Simon Riggs si...@2ndquadrant.com wrote:
   I think this feature will be used in a lot of scenarios in
   which PITR is currently used.
  
   We have to judge which is better, we get something potential or to protect
   stupid.
   And we had better to wait author's comment...
 
  I'd say just document that it wouldn't make sense to use it for PITR.
 
  There may be some use case we can't see yet, so specifically
  prohibiting a use case that is not dangerous seems too much at this
  point. I will no doubt be reminded of these words in the future...

 I tend to agree with Simon, but I confess that I don't liked to delay a
 server with standby_mode = 'off'.

 The main goal of this patch is delay the Streaming Replication, so if the
 slave server isn't a hot-standby I think makes no sense to delay it.

 Mitsumasa suggested to add StandbyModeRequested in conditional branch to
 skip this situation. I agree with him!

I don't think that position has any merit, sorry: Think about the way
this stuff gets setup. The user creates a new basebackup (pg_basebackup,
manual pg_start/stop_backup, shutdown primary). Then he creates a
recovery conf by either starting from scratch, using
--write-recovery-conf or by copying recovery.conf.sample. In none of
these cases delay will be configured.

So, with that in mind, the only way it could have been configured is by
the user *explicitly* writing it into recovery.conf. And now you want to
react to this explicit step by just *silently* ignoring the setting
based on some random criteria (arguments have been made about
hot_standby=on/off, standby_mode=on/off which aren't directly
related). Why on earth would that be a usability improvement?

Also, you seem to assume there's no point in configuring it for any of
hot_standby=off, standby_mode=off, recovery_target=*. Why? There's
usecases for all of them:
* hot_standby=off: Makes delay useable with wal_level=archive (and thus
  a lower WAL volume)
* standby_mode=off: Configurations that use tools like pg_standby and
  similar simply don't need standby_mode=on. If you want to trigger
  failover from within the restore_command you *cannot* set it.
* recovery_target_*: It can still make sense if you use
  pause_at_recovery_target.

In which scenarios does your restriction actually improve anything?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-11 Thread Fabrízio de Royes Mello
On Wed, Dec 11, 2013 at 7:47 PM, Andres Freund and...@2ndquadrant.com
wrote:

 I don't think that position has any merit, sorry: Think about the way
 this stuff gets setup. The user creates a new basebackup (pg_basebackup,
 manual pg_start/stop_backup, shutdown primary). Then he creates a
 recovery conf by either starting from scratch, using
 --write-recovery-conf or by copying recovery.conf.sample. In none of
 these cases delay will be configured.


Ok.


 So, with that in mind, the only way it could have been configured is by
 the user *explicitly* writing it into recovery.conf. And now you want to
 to react to this explicit step by just *silently* ignoring the setting
 based on some random criteria (arguments have been made about
 hot_standby=on/off, standby_mode=on/off which aren't directly
 related). Why on earth would that by a usability improvement?

 Also, you seem to assume there's no point in configuring it for any of
 hot_standby=off, standby_mode=off, recovery_target=*. Why? There's
 usecases for all of them:
 * hot_standby=off: Makes delay useable with wal_level=archive (and thus
   a lower WAL volume)
 * standby_mode=off: Configurations that use tools like pg_standby and
   similar simply don't need standby_mode=on. If you want to trigger
   failover from within the restore_command you *cannot* set it.
 * recovery_target_*: It can still make sense if you use
   pause_at_recovery_target.

 In which scenarios does your restriction actually improve anything?


Given your arguments I'm forced to revise my understanding of the problem.
You are absolutely right in your assertions. I was not seeing the scenario
from this perspective.

Anyway we need to improve docs, any suggestions?

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Time-Delayed Standbys

2013-12-10 Thread Andres Freund
On 2013-12-10 13:26:27 +0900, KONDO Mitsumasa wrote:
 (2013/12/09 20:29), Andres Freund wrote:
 On 2013-12-09 19:51:01 +0900, KONDO Mitsumasa wrote:
 Add my comment. We have to consider three situations.
 
 1. PITR
 2. replication standby
 3. replication standby with restore_command
 
 I think this patch cannot delay in 1 situation.
 
 Why?
 
 I have three reasons.

None of these reasons seem to be of technical nature, right?

   1. It is written in document. Can we remove it?
   2. Name of this feature is Time-delayed *standbys*, not Time-delayed
  *recovery*. Can we change it?

I don't think that'd be a win in clarity. But perhaps somebody else has
a better suggestion?

   3. I think it is unnessesary in master PITR. And if it can delay in master
  PITR, it will become master at unexpected timing, not to continue to
  recovery. It is meaningless.

master PITR? What's that? All PITR is based on recovery.conf and thus
not really a master?

Why should we prohibit using this feature in PITR? I don't see any
advantage in doing so. If somebody doesn't want the delay, they
shouldn't set it in the configuration file. End of story.

There's not really a meaningful distinction between PITR and
replication using archive_command. Especially when using
*pause_after. I think this feature will be used in a lot of scenarios in
which PITR is currently used.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-10 Thread KONDO Mitsumasa

(2013/12/10 18:38), Andres Freund wrote:

master PITR? What's that? All PITR is based on recovery.conf and thus
not really a master?
By master PITR I mean PITR with standby_mode = off, that is, plain recovery from
a base backup. The difference between master PITR and a standby is that the
former ends up on its own independent timeline ID, while the latter stays on the
same timeline ID, following the master server. In the first place, their
purposes are different.



Why should we prohibit using this feature in PITR? I don't see any
advantage in doing so. If somebody doesn't want the delay, they
shouldn't set it in the configuration file. End of story.
Unfortunately, there are a lot of careless people in the world... I think you
have such clients, too.



There's not really a that meaningful distinction between PITR and
replication using archive_command. Especially when using
*pause_after.
It is meaningless in master PITR. The server would become a master with a new
timeline ID at an unexpected time.



I think this feature will be used in a lot of scenarios in
which PITR is currently used.

We have to judge which is better: gaining some potential use cases, or
protecting careless users.
And we had better wait for the author's comment...

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center




Re: [HACKERS] Time-Delayed Standbys

2013-12-09 Thread KONDO Mitsumasa

Hi Fabrízio,

I tested your v4 patch; here are my review comments.

* Fix typo
 49 -# commited transactions from the master, specify a recovery time delay.
 49 +# committed transactions from the master, specify a recovery time delay.

* Fix white space
 177 -   if (secs <= 0 && microsecs <=0)
 177 +   if (secs <= 0 && microsecs <=0 )

* Add functionality (I propose)
We can set a negative number for min_standby_apply_delay. I think this would be
useful for world-wide replication. For example, the master server is in Japan
and the slave server is in San Francisco. Japan time is ahead of San Francisco
time, so delaying in this situation may need a negative number in
min_standby_apply_delay. So I propose changing the time delay conditional
branch as follows.

 - if (min_standby_apply_delay > 0)
 + if (min_standby_apply_delay != 0)
What do you think? It might also work correctly.


* Problem 1
I read the documentation you wrote. It says PITR is not affected.
However, when I run PITR with min_standby_apply_delay=300, the server cannot
start. The log is as follows.

[mitsu-ko@localhost postgresql]$ bin/pg_ctl -D data2 start
server starting
[mitsu-ko@localhost postgresql]$ LOG:  database system was interrupted; last 
known up at 2013-12-08 18:57:00 JST
LOG:  creating missing WAL directory pg_xlog/archive_status
cp: cannot stat `../arc/0002.history':
LOG:  starting archive recovery
LOG:  restored log file 00010041 from archive
LOG:  redo starts at 0/4128
LOG:  consistent recovery state reached at 0/41F0
LOG:  database system is ready to accept read only connections
LOG:  restored log file 00010042 from archive
FATAL:  cannot wait on a latch owned by another process
LOG:  startup process (PID 30501) exited with exit code 1
LOG:  terminating any other active server processes

We need a recovery flag for controlling the PITR situation.


That's all for now.
If you are busy, please fix it at your own pace. I'm busy too, so I'm happy to
wait:-)


Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center




Re: [HACKERS] Time-Delayed Standbys

2013-12-09 Thread Pavel Stehule
2013/12/9 KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp

 Hi Fabrízio,

 I test your v4 patch, and send your review comments.

 * Fix typo
  49 -# commited transactions from the master, specify a recovery time
 delay.
  49 +# committed transactions from the master, specify a recovery time
 delay.

 * Fix white space
  177 -   if (secs = 0  microsecs =0)
  177 +   if (secs = 0  microsecs =0 )

 * Add functionality (I propose)
 We can set negative number at min_standby_apply_delay. I think that this
 feature
 is for world wide replication situation. For example, master server is in
 Japan and slave server is in San Francisco. Japan time fowards than San
 Francisco time
 . And if we want to delay in this situation, it can need negative number
 in min_standby_apply_delay. So I propose that time delay conditional branch
 change under following.
  - if (min_standby_apply_delay  0)
  + if (min_standby_apply_delay != 0)
 What do you think? It might also be working collectry.


What about using an interval instead of an absolute time?

Regards

Pavel




 * Problem 1
 I read your wittened document. There is PITR has not affected.
 However, when I run PITR with min_standby_apply_delay=300, it cannot
 start server. The log is under following.

 [mitsu-ko@localhost postgresql]$ bin/pg_ctl -D data2 start
 server starting
 [mitsu-ko@localhost postgresql]$ LOG:  database system was interrupted;
 last known up at 2013-12-08 18:57:00 JST
 LOG:  creating missing WAL directory pg_xlog/archive_status
 cp: cannot stat `../arc/0002.history':
 LOG:  starting archive recovery
 LOG:  restored log file 00010041 from archive
 LOG:  redo starts at 0/4128
 LOG:  consistent recovery state reached at 0/41F0
 LOG:  database system is ready to accept read only connections
 LOG:  restored log file 00010042 from archive
 FATAL:  cannot wait on a latch owned by another process
 LOG:  startup process (PID 30501) exited with exit code 1
 LOG:  terminating any other active server processes

 We need recovery flag for controling PITR situation.


 That's all for now.
 If you are busy, please fix in your pace. I'm busy and I'd like to wait
 your time, too:-)

 Regards,
 --
 Mitsumasa KONDO
 NTT Open Source Software Center





Re: [HACKERS] Time-Delayed Standbys

2013-12-09 Thread KONDO Mitsumasa

(2013/12/09 19:36), KONDO Mitsumasa wrote:

* Problem 1
I read your wittened document. There is PITR has not affected.
However, when I run PITR with min_standby_apply_delay=300, it cannot start
server. The log is under following.

[mitsu-ko@localhost postgresql]$ bin/pg_ctl -D data2 start
server starting
[mitsu-ko@localhost postgresql]$ LOG:  database system was interrupted; last
known up at 2013-12-08 18:57:00 JST
LOG:  creating missing WAL directory pg_xlog/archive_status
cp: cannot stat `../arc/0002.history':
LOG:  starting archive recovery
LOG:  restored log file 00010041 from archive
LOG:  redo starts at 0/4128
LOG:  consistent recovery state reached at 0/41F0
LOG:  database system is ready to accept read only connections
LOG:  restored log file 00010042 from archive
FATAL:  cannot wait on a latch owned by another process
LOG:  startup process (PID 30501) exited with exit code 1
LOG:  terminating any other active server processes

We need recovery flag for controling PITR situation.

Add my comment. We have to consider three situations.

1. PITR
2. replication standby
3. replication standby with restore_command

I think this patch cannot delay in situation 1. So I think you should add only
the StandbyModeRequested flag to the conditional branch.


Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center










Re: [HACKERS] Time-Delayed Standbys

2013-12-09 Thread KONDO Mitsumasa

(2013/12/09 19:35), Pavel Stehule wrote:




2013/12/9 KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp

Hi Fabrízio,

I test your v4 patch, and send your review comments.

* Fix typo
  49 -# commited transactions from the master, specify a recovery time 
delay.
  49 +# committed transactions from the master, specify a recovery time 
delay.

* Fix white space
  177 -   if (secs = 0  microsecs =0)
  177 +   if (secs = 0  microsecs =0 )

* Add functionality (I propose)
We can set negative number at min_standby_apply_delay. I think that this 
feature
is for world wide replication situation. For example, master server is in
Japan and slave server is in San Francisco. Japan time fowards than San
Francisco time
. And if we want to delay in this situation, it can need negative number in
min_standby_apply_delay. So I propose that time delay conditional branch
change under following.
  - if (min_standby_apply_delay  0)
  + if (min_standby_apply_delay != 0)
What do you think? It might also be working collectry.


what using interval instead absolute time?
This is because local time is recorded in XLOG, and calculating global time
has a big cost.


Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center




Re: [HACKERS] Time-Delayed Standbys

2013-12-09 Thread Andres Freund
On 2013-12-09 19:51:01 +0900, KONDO Mitsumasa wrote:
 Add my comment. We have to consider three situations.
 
 1. PITR
 2. replication standby
 3. replication standby with restore_command
 
 I think this patch cannot delay in 1 situation.

Why?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-09 Thread Craig Ringer
On 12/04/2013 02:46 AM, Robert Haas wrote:
 Thanks for your review Christian...
 
 So, I proposed this patch previously and I still think it's a good
 idea, but it got voted down on the grounds that it didn't deal with
 clock drift.  I view that as insufficient reason to reject the
 feature, but others disagreed.  Unless some of those people have
 changed their minds, I don't think this patch has much future here.

Surely that's the operating system / VM host / sysadmin / whatever's
problem?

The only way to deal with clock drift that isn't fragile in the face
of variable latency, etc, is to basically re-implement (S)NTP in order
to find out what the clock difference with the remote is.

If we're going to do that, why not just let the OS deal with it?

It might well be worth complaining about obvious aberrations like
timestamps in the local future - preferably by complaining and not
actually dying. It does need to be able to cope with a *skewing* clock,
but I'd be surprised if it had any issues there in the first place.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-09 Thread Greg Stark
On 9 Dec 2013 12:16, Craig Ringer cr...@2ndquadrant.com wrote:

 The only way to deal with clock drift that isn't fragile in the face
 of variable latency, etc, is to basically re-implement (S)NTP in order
 to find out what the clock difference with the remote is.

There's actually an entirely different way to deal with clock drift: treat
master time and slave time as two different incomparable spaces.
Similar to how you would treat measurements in different units.

If you do that then you can measure and manage the delay in the slave
between receiving and applying a record and also measure the amount of
master server time which can be pending. These measurements don't depend at
all on time sync between servers.

The specified feature depends explicitly on the conversion between master
and slave time spaces so it's inevitable that sync would be an issue. It
might be nice to print a warning on connection if the time is far out of
sync or periodically check. But I don't think reimplementing NTP is a good
idea.
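
A minimal sketch of the "two time spaces" idea described above, assuming
timestamps are plain microsecond counters. The type and function names are
hypothetical and do not come from the patch or from PostgreSQL. Each function
only ever subtracts values read from the same clock, so no synchronisation
between the servers is needed:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Microsecond counters. The two typedefs are the same underlying type;
 * the distinct names only document which clock a value came from.
 */
typedef int64_t standby_usec;   /* read from the standby's clock */
typedef int64_t master_usec;    /* read from the master's clock  */

/*
 * Delay measured purely on the standby: has the record been held for at
 * least min_hold since the standby received it?
 */
bool
held_long_enough(standby_usec received_at, standby_usec now,
                 standby_usec min_hold)
{
    assert(now >= received_at);
    return (now - received_at) >= min_hold;
}

/*
 * Backlog measured purely in master time: the span between the newest
 * commit received and the newest commit applied.
 */
master_usec
pending_master_time(master_usec last_received_commit,
                    master_usec last_applied_commit)
{
    assert(last_received_commit >= last_applied_commit);
    return last_received_commit - last_applied_commit;
}
```

Note that this measures something different from the proposed parameter: the
hold time starts at receipt on the standby, not at commit on the master.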


Re: [HACKERS] Time-Delayed Standbys

2013-12-09 Thread KONDO Mitsumasa

(2013/12/09 20:29), Andres Freund wrote:

On 2013-12-09 19:51:01 +0900, KONDO Mitsumasa wrote:

Add my comment. We have to consider three situations.

1. PITR
2. replication standby
3. replication standby with restore_command

I think this patch cannot delay in 1 situation.


Why?


I have three reasons.

  1. It is written in the documentation. Can we remove it?
  2. The name of this feature is Time-delayed *standbys*, not Time-delayed
 *recovery*. Can we change it?
  3. I think it is unnecessary in master PITR. And if it can delay in master
 PITR, the server will become a master at an unexpected time instead of
 continuing recovery. It is meaningless.

I'd like to ask what you expect from this feature and how you would use it.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center




Re: [HACKERS] Time-Delayed Standbys

2013-12-06 Thread Robert Haas
On Thu, Dec 5, 2013 at 11:07 PM, Fabrízio de Royes Mello
fabriziome...@gmail.com wrote:
 On Tue, Dec 3, 2013 at 5:33 PM, Simon Riggs si...@2ndquadrant.com wrote:

  - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
  XLOG_XACT_COMMIT_COMPACT checks

 Why just those? Why not aborts and restore points also?


 I think make no sense execute the delay after aborts and/or restore points,
 because it not change data in a standby server.

I see no reason to pause for aborts.  Aside from the fact that it
wouldn't be reliable in corner cases, as Fabrízio says, there's no
user-visible effect, just as there's no user-visible effect from
replaying a transaction up until just prior to the point where it
commits (which we also do).

Waiting for restore points seems like it potentially makes sense.  If
the standby is delayed by an hour, and you create a restore point and
wait 55 minutes, you might expect that you can still kill the
standby and recover it to that restore point.
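
To make the restore point case concrete, here is a small sketch of the rule
being discussed: delay commits and restore points, but not aborts. The enum
and function names are hypothetical stand-ins, not the record types or
functions used in xlog.c, and the model looks at one record in isolation
(real WAL replay is strictly ordered):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int64_t usec_t;
#define USECS_PER_MINUTE INT64_C(60000000)

/* Hypothetical record kinds, standing in for the real WAL record types. */
typedef enum
{
    REC_COMMIT,
    REC_COMMIT_COMPACT,
    REC_ABORT,
    REC_RESTORE_POINT,
    REC_OTHER
} rec_kind;

/* Which kinds of record are subject to the apply delay. */
bool
record_is_delayed(rec_kind kind)
{
    switch (kind)
    {
        case REC_COMMIT:
        case REC_COMMIT_COMPACT:
        case REC_RESTORE_POINT:
            return true;
        default:
            return false;
    }
}

/* May a record stamped record_time be applied at time now? */
bool
may_apply(rec_kind kind, usec_t record_time, usec_t now, usec_t delay)
{
    assert(delay >= 0);
    if (!record_is_delayed(kind))
        return true;
    return now >= record_time + delay;
}
```

With a 60 minute delay, a restore point created at time 0 is still unapplied
at minute 55, so the standby can be stopped and recovered to it.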

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Time-Delayed Standbys

2013-12-06 Thread Fabrízio de Royes Mello
On Fri, Dec 6, 2013 at 1:36 PM, Robert Haas robertmh...@gmail.com wrote:

 On Thu, Dec 5, 2013 at 11:07 PM, Fabrízio de Royes Mello
 fabriziome...@gmail.com wrote:
  On Tue, Dec 3, 2013 at 5:33 PM, Simon Riggs si...@2ndquadrant.com
 wrote:
 
   - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
   XLOG_XACT_COMMIT_COMPACT checks
 
  Why just those? Why not aborts and restore points also?
 
 
  I think make no sense execute the delay after aborts and/or restore
 points,
  because it not change data in a standby server.

 I see no reason to pause for aborts.  Aside from the fact that it
 wouldn't be reliable in corner cases, as Fabrízio says, there's no
 user-visible effect, just as there's no user-visible effect from
 replaying a transaction up until just prior to the point where it
 commits (which we also do).

 Waiting for restore points seems like it potentially makes sense.  If
 the standby is delayed by an hour, and you create a restore point and
 wait 55 minutes, you might expect that that you can still kill the
 standby and recover it to that restore point.


Makes sense. Fixed.

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index 9d80256..12aa917 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -142,6 +142,31 @@ restore_command = 'copy C:\\server\\archivedir\\%f %p'  # Windows
   </listitem>
  </varlistentry>
 
+ <varlistentry id="min-standby-apply-delay" xreflabel="min_standby_apply_delay">
+  <term><varname>min_standby_apply_delay</varname> (<type>integer</type>)</term>
+  <indexterm>
+<primary><varname>min_standby_apply_delay</> recovery parameter</primary>
+  </indexterm>
+  <listitem>
+   <para>
+Specifies the amount of time (in milliseconds, if no unit is specified)
+which recovery of transaction commits should lag the master.  This
+parameter allows creation of a time-delayed standby.  For example, if
+you set this parameter to <literal>5min</literal>, the standby will
+replay each transaction commit only when the system time on the standby
+is at least five minutes past the commit time reported by the master.
+   </para>
+   <para>
+Note that if the master and standby system clocks are not synchronized,
+this might lead to unexpected results.
+   </para>
+   <para>
+This parameter works only for streaming replication deployments. Synchronous
+replicas and PITR has not affected.
+   </para>
+  </listitem>
+ </varlistentry>
+
 </variablelist>
 
   </sect1>
diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index 5acfa57..e8617db 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -123,6 +123,17 @@
 #
 #trigger_file = ''
 #
+# min_standby_apply_delay
+#
+# By default, a standby server keeps restoring XLOG records from the
+# primary as soon as possible. If you want to delay the replay of
+# commited transactions from the master, specify a recovery time delay.
+# For example, if you set this parameter to 5min, the standby will replay
+# each transaction commit only when the system time on the standby is least
+# five minutes past the commit time reported by the master.
+#
+#min_standby_apply_delay = 0
+#
 #---
 # HOT STANDBY PARAMETERS
 #---
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b68230d..7ca2f9b 100755
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -218,6 +218,8 @@ static bool recoveryPauseAtTarget = true;
 static TransactionId recoveryTargetXid;
 static TimestampTz recoveryTargetTime;
 static char *recoveryTargetName;
+static int min_standby_apply_delay = 0;
+static TimestampTz recoveryDelayUntilTime;
 
 /* options taken from recovery.conf for XLOG streaming */
 static bool StandbyModeRequested = false;
@@ -730,6 +732,7 @@ static void readRecoveryCommandFile(void);
 static void exitArchiveRecovery(TimeLineID endTLI, XLogSegNo endLogSegNo);
 static bool recoveryStopsHere(XLogRecord *record, bool *includeThis);
 static void recoveryPausesHere(void);
+static void recoveryDelay(void);
 static void SetLatestXTime(TimestampTz xtime);
 static void SetCurrentChunkStartTime(TimestampTz xtime);
 static void CheckRequiredParameterValues(void);
@@ -5474,6 +5477,19 @@ readRecoveryCommandFile(void)
 	(errmsg_internal("trigger_file = '%s'",
 	 TriggerFile)));
 		}
+		else if (strcmp(item->name, "min_standby_apply_delay") == 0)

Re: [HACKERS] Time-Delayed Standbys

2013-12-05 Thread Magnus Hagander
On Thu, Dec 5, 2013 at 1:45 AM, Simon Riggs si...@2ndquadrant.com wrote:

 On 3 December 2013 18:46, Robert Haas robertmh...@gmail.com wrote:
  On Tue, Dec 3, 2013 at 12:36 PM, Fabrízio de Royes Mello
  fabriziome...@gmail.com wrote:
  On Tue, Dec 3, 2013 at 2:33 PM, Christian Kruse 
 christ...@2ndquadrant.com
  wrote:
 
  Hi Fabrizio,
 
  looks good to me. I did some testing on 9.2.4, 9.2.5 and HEAD. It
  applies and compiles w/o errors or warnings. I set up a master and two
  hot standbys replicating from the master, one with 5 minutes delay and
  one without delay. After that I created a new database and generated
  some test data:
 
  CREATE TABLE test (val INTEGER);
  INSERT INTO test (val) (SELECT * FROM generate_series(0, 100));
 
  The non-delayed standby nearly instantly had the data replicated, the
  delayed standby was replicated after exactly 5 minutes. I did not
  notice any problems, errors or warnings.
 
 
  Thanks for your review Christian...
 
  So, I proposed this patch previously and I still think it's a good
  idea, but it got voted down on the grounds that it didn't deal with
  clock drift.  I view that as insufficient reason to reject the
  feature, but others disagreed.  Unless some of those people have
  changed their minds, I don't think this patch has much future here.

 I had that objection and others. Since then many people have requested
 this feature and have persuaded me that this is worth having and that
 my objections are minor points. I now agree with the need for the
 feature, almost as written.



Not recalling the older thread, but it seems it breaks on clock drift. I
think we can fairly easily make that situation good enough. Just have
IDENTIFY_SYSTEM return the current timestamp on the master, and refuse to
start if the time difference is too great. Yes, that doesn't catch the case
when the machines are in perfect sync when they start up and drift *later*,
but it will catch the most common cases I bet. But I think that's good
enough that we can accept the feature, given that *most* people will have
ntp, and that it's a very useful feature for those people. But we could
help people who run into it because of a simple config error..

Or maybe the suggested patch already does this, in which case ignore that
part :)
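
A rough sketch of the startup check suggested above, assuming the standby has
somehow obtained the master's current timestamp. How it is obtained is left
out, and the function name and the max_skew threshold are hypothetical and
not part of the patch. Network latency is ignored here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef int64_t usec_t;   /* microseconds since a shared epoch */

/*
 * Returns true when the master's reported time and the standby's local
 * time differ by more than max_skew, in either direction.
 */
bool
clock_skew_too_large(usec_t master_now, usec_t standby_now, usec_t max_skew)
{
    usec_t diff;

    assert(max_skew >= 0);
    diff = (master_now > standby_now) ? master_now - standby_now
                                      : standby_now - master_now;
    return diff > max_skew;
}
```

A caller could refuse to start, or only log a warning, when this returns
true. As noted above, a one-time check cannot catch clocks that drift apart
later.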

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Time-Delayed Standbys

2013-12-05 Thread Simon Riggs
On 5 December 2013 08:51, Magnus Hagander mag...@hagander.net wrote:

 Not recalling the older thread, but it seems the breaks on clock drift, I
 think we can fairly easily make that situation good enough. Just have
 IDENTIFY_SYSTEM return the current timestamp on the master, and refuse to
 start if the time difference is too great. Yes, that doesn't catch the case
 when the machines are in perfect sync when they start up and drift *later*,
 but it will catch the most common cases I bet. But I think that's good
 enough that we can accept the feature, given that *most* people will have
 ntp, and that it's a very useful feature for those people. But we could help
 people who run into it because of a simple config error..

 Or maybe the suggested patch already does this, in which case ignore that
 part :)

I think the very nature of *this* feature is that it doesn't *require*
the clocks to be exactly in sync, even though that is the basis for
measurement.

The setting of this parameter for sane usage would be minimum 5 mins,
but more likely 30 mins, 1 hour or more.

In that case, a few seconds drift either way makes no real difference
to this feature.

So IMHO, without prejudice to other features that may be more
critically reliant on time synchronisation, we are OK to proceed with
this specific feature.
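
As a worked example of that argument, assuming the only error is a constant
offset between the two clocks (the names below are illustrative and not taken
from the patch): the standby applies a commit once its own clock passes the
master's commit time plus the configured delay, so an offset shifts the real
delay by exactly that offset.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t usec_t;
#define USECS_PER_SEC INT64_C(1000000)

/*
 * Real elapsed time between commit on the master and apply on the standby,
 * when the standby's clock runs standby_ahead_by ahead of the master's
 * (negative means the standby's clock is behind).
 */
usec_t
effective_delay(usec_t configured_delay, usec_t standby_ahead_by)
{
    assert(configured_delay >= 0);
    return configured_delay - standby_ahead_by;
}
```

For a 30 minute setting and a standby clock 5 seconds ahead, the real delay
is 1795 seconds instead of 1800, an error of under 0.3%.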

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-05 Thread Fabrízio de Royes Mello
On Thu, Dec 5, 2013 at 7:57 AM, Simon Riggs si...@2ndquadrant.com wrote:

 On 5 December 2013 08:51, Magnus Hagander mag...@hagander.net wrote:

  Not recalling the older thread, but it seems the breaks on clock
 drift, I
  think we can fairly easily make that situation good enough. Just have
  IDENTIFY_SYSTEM return the current timestamp on the master, and refuse to
  start if the time difference is too great. Yes, that doesn't catch the
 case
  when the machines are in perfect sync when they start up and drift
 *later*,
  but it will catch the most common cases I bet. But I think that's good
  enough that we can accept the feature, given that *most* people will have
  ntp, and that it's a very useful feature for those people. But we could
 help
  people who run into it because of a simple config error..
 
  Or maybe the suggested patch already does this, in which case ignore that
  part :)

 I think the very nature of *this* feature is that it doesnt *require*
 the clocks to be exactly in sync, even though that is the basis for
 measurement.

 The setting of this parameter for sane usage would be minimum 5 mins,
 but more likely 30 mins, 1 hour or more.

 In that case, a few seconds drift either way makes no real difference
 to this feature.

 So IMHO, without prejudice to other features that may be more
 critically reliant on time synchronisation, we are OK to proceed with
 this specific feature.


Hi all,

I saw all of your comments. I'm a bit busy with some customer issues
(it has been a crazy week), but I'll reply and/or fix your suggestions later.

Thanks for all the reviews and sorry for the delay in replying.

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Time-Delayed Standbys

2013-12-05 Thread Fabrízio de Royes Mello
On Tue, Dec 3, 2013 at 5:33 PM, Simon Riggs si...@2ndquadrant.com wrote:

  - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
  XLOG_XACT_COMMIT_COMPACT checks

 Why just those? Why not aborts and restore points also?


I think it makes no sense to execute the delay after aborts and/or restore
points, because they do not change data on a standby server.


  - don't care about clockdrift because it's an admin problem.

 Few minor points on things

 * The code with comment Clear any previous recovery delay time is in
 wrong place, move down or remove completely. Setting the delay to zero
 doesn't prevent calling recoveryDelay(), so that logic looks wrong
 anyway.


Fixed.


 * The loop exit in recoveryDelay() is inelegant, should break if <= 0


Fixed.


 * There's a spelling mistake in sample


Fixed.


 * The patch has whitespace in one place


Fixed.


 and one important point...

 * The delay loop happens AFTER we check for a pause. Which means if
 the user notices a problem on a commit, then hits pause button on the
 standby, the pause will have no effect and the next commit will be
 applied anyway. Maybe just one commit, but it's an off-by-one error
 that removes the benefit of the patch. So I think we need to test this
 again after we finish delaying

 if (xlogctl->recoveryPause)
   recoveryPausesHere();


Fixed.



 We need to explain in the docs that this is intended only for use in a
 live streaming deployment. It will have little or no meaning in a
 PITR.


Fixed.


 I think recovery_time_delay should be called
 something_apply_delay
 to highlight the point that it is the apply of records that is
 delayed, not the receipt. And hence the need to document that sync rep
 is NOT slowed down by setting this value.


Fixed.


 And to make the name consistent with other parameters, I suggest
 min_standby_apply_delay


I agree. Fixed!


 We also need to document caveats about the patch, in that it only
 delays on timestamped WAL records and other records may be applied
 sooner than the delay in some circumstances, so it is not a way to
 avoid all cancellations.

 We also need to document that the behaviour of the patch is to apply all
 data received as quickly as possible once triggered, so the specified
 delay does not slow down promoting the server to a master. That might
 also be seen as a negative behaviour, since promoting the master
 effectively sets recovery_time_delay to zero.

 I will handle the additional documentation, if you can update the
 patch with the main review comments. Thanks.


Thanks, your help is welcome.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index 9d80256..12aa917 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -142,6 +142,31 @@ restore_command = 'copy C:\\server\\archivedir\\%f %p'  # Windows
   </listitem>
  </varlistentry>
 
+ <varlistentry id="min-standby-apply-delay" xreflabel="min_standby_apply_delay">
+  <term><varname>min_standby_apply_delay</varname> (<type>integer</type>)</term>
+  <indexterm>
+<primary><varname>min_standby_apply_delay</> recovery parameter</primary>
+  </indexterm>
+  <listitem>
+   <para>
+Specifies the amount of time (in milliseconds, if no unit is specified)
+which recovery of transaction commits should lag the master.  This
+parameter allows creation of a time-delayed standby.  For example, if
+you set this parameter to <literal>5min</literal>, the standby will
+replay each transaction commit only when the system time on the standby
+is at least five minutes past the commit time reported by the master.
+   </para>
+   <para>
+Note that if the master and standby system clocks are not synchronized,
+this might lead to unexpected results.
+   </para>
+   <para>
+This parameter works only for streaming replication deployments. Synchronous
+replicas and PITR are not affected.
+   </para>
+  </listitem>
+ </varlistentry>
+
 </variablelist>
 
   </sect1>
diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index 5acfa57..e8617db 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -123,6 +123,17 @@
 #
 #trigger_file = ''
 #
+# min_standby_apply_delay
+#
+# By default, a standby server keeps restoring XLOG records from the
+# primary as soon as possible. If you want to delay the replay of
+# committed transactions from the master, specify a recovery time delay.
+# For example, if you set this parameter to 5min, the standby will replay
+# each transaction commit only when the system time on the standby is at least
+# 

Re: [HACKERS] Time-Delayed Standbys

2013-12-04 Thread Andres Freund
On 2013-12-04 11:13:58 +0900, KONDO Mitsumasa wrote:
 4) Start the slave and connect to it using psql and in another session I can
 see all archive recovery log
 Hmm... I thought I had made a mistake in reading your email, but I reproduced it
 again. When I set a small recovery_time_delay (=3), it seemed to work correctly.
 However, when I set a long recovery_time_delay (=300), it didn't work.

 My log of the reproduced operation follows.
 [mitsu-ko@localhost postgresql]$ bin/pgbench -T 30 -c 8 -j4  -p5432
 starting vacuum...end.
 transaction type: TPC-B (sort of)
 scaling factor: 10
 query mode: simple
 number of clients: 8
 number of threads: 4
 duration: 30 s
 number of transactions actually processed: 68704
 latency average: 3.493 ms
 tps = 2289.196747 (including connections establishing)
 tps = 2290.175129 (excluding connections establishing)
 [mitsu-ko@localhost postgresql]$ vim slave/recovery.conf
 [mitsu-ko@localhost postgresql]$ bin/pg_ctl -D slave start
 server starting
 [mitsu-ko@localhost postgresql]$ LOG:  database system was shut down in 
 recovery at 2013-12-03 10:26:41 JST
 LOG:  entering standby mode
 LOG:  consistent recovery state reached at 0/5C4D8668
 LOG:  redo starts at 0/5C4000D8
 [mitsu-ko@localhost postgresql]$ FATAL:  the database system is starting up
 FATAL:  the database system is starting up
 FATAL:  the database system is starting up
 FATAL:  the database system is starting up
 FATAL:  the database system is starting up
 [mitsu-ko@localhost postgresql]$ bin/psql -p6543
 psql: FATAL:  the database system is starting up
 [mitsu-ko@localhost postgresql]$ bin/psql -p6543
 psql: FATAL:  the database system is starting up
 I attached my postgresql.conf and recovery.conf. It will be reproduced.

So, you brought up a standby and it took more time to become consistent
because it waited on commits? That's the problem? If so, I don't think
that's a bug?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-04 Thread Christian Kruse
Hi,

On 04/12/13 11:13, KONDO Mitsumasa wrote:
 1) Clusters
 - build master
 - build slave and attach to the master using SR and config
 recovery_time_delay to 1min.
 
 2) Stop the Slave
 
 3) Run some transactions on the master using pgbench to generate a lot of
 archives
 
 4) Start the slave and connect to it using psql and in another session I can
 see all archive recovery log
 Hmm... I thought I had made a mistake in reading your email, but I reproduced it
 again. When I set a small recovery_time_delay (=3), it seemed to work correctly.
 However, when I set a long recovery_time_delay (=300), it didn't work.
 […]

I'm not sure I understand your problem correctly. I'll try to
summarize; please correct me if I'm wrong:

You created a master node and a hot standby with 300 delay. Then
you stopped the standby, ran pgbench and started the hot standby
again. It did not get in line with the master. Is this correct?

I don't see a problem here… the standby should not be in sync with the
master, it should be delayed. I did step by step what you did and
after 50 minutes (300ms) the standby was at the same level the
master was.

Did I misunderstand you?

Regards,
 Christian Kruse

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services





Re: [HACKERS] Time-Delayed Standbys

2013-12-04 Thread Mitsumasa KONDO
2013/12/4 Andres Freund and...@2ndquadrant.com

 On 2013-12-04 11:13:58 +0900, KONDO Mitsumasa wrote:
  4) Start the slave and connect to it using psql and in another session
  I can see all archive recovery log
  Hmm... I thought I had made a mistake in reading your email, but I reproduced
  it again.
  When I set a small recovery_time_delay (=3), it seemed to work correctly.
  However, when I set a long recovery_time_delay (=300), it didn't work.

  My log of the reproduced operation follows.
  [mitsu-ko@localhost postgresql]$ bin/pgbench -T 30 -c 8 -j4  -p5432
  starting vacuum...end.
  transaction type: TPC-B (sort of)
  scaling factor: 10
  query mode: simple
  number of clients: 8
  number of threads: 4
  duration: 30 s
  number of transactions actually processed: 68704
  latency average: 3.493 ms
  tps = 2289.196747 (including connections establishing)
  tps = 2290.175129 (excluding connections establishing)
  [mitsu-ko@localhost postgresql]$ vim slave/recovery.conf
  [mitsu-ko@localhost postgresql]$ bin/pg_ctl -D slave start
  server starting
  [mitsu-ko@localhost postgresql]$ LOG:  database system was shut down
 in recovery at 2013-12-03 10:26:41 JST
  LOG:  entering standby mode
  LOG:  consistent recovery state reached at 0/5C4D8668
  LOG:  redo starts at 0/5C4000D8
  [mitsu-ko@localhost postgresql]$ FATAL:  the database system is
 starting up
  FATAL:  the database system is starting up
  FATAL:  the database system is starting up
  FATAL:  the database system is starting up
  FATAL:  the database system is starting up
  [mitsu-ko@localhost postgresql]$ bin/psql -p6543
  psql: FATAL:  the database system is starting up
  [mitsu-ko@localhost postgresql]$ bin/psql -p6543
  psql: FATAL:  the database system is starting up
  I attached my postgresql.conf and recovery.conf. It will be reproduced.

 So, you brought up a standby and it took more time to become consistent
 because it waited on commits? That's the problem? If so, I don't think
 that's a bug?

When this happens, psql cannot connect to the standby server at all. I think
this behavior is not good.
The feature should only delay the recovery position, and clients should still
be able to see the old (delayed) table data.
Being unable to connect to the server is not the desired behavior.
If you think this behavior is best, I will set the patch to Ready for
Committer, and the committer can improve it.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center


Re: [HACKERS] Time-Delayed Standbys

2013-12-04 Thread Andres Freund
On 2013-12-04 22:47:47 +0900, Mitsumasa KONDO wrote:
 2013/12/4 Andres Freund and...@2ndquadrant.com
 When this happens, psql cannot connect to the standby server at all. I think
 this behavior is not good.
 It should only delay the recovery position, and old (delayed) table data
 should still be visible.

That doesn't sound like a good plan - even if the clients cannot connect
yet, you can still promote the server. Just not taking delay into
consideration at that point seems like it would possibly surprise users
rather badly in situations they really cannot use such surprises.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-04 Thread Mitsumasa KONDO
2013/12/4 Christian Kruse christ...@2ndquadrant.com

 You created a master node and a hot standby with 300 delay. Then
 you stopped the standby, ran pgbench and started the hot standby
 again. It did not get in line with the master. Is this correct?

No. First, I started the master and ran pgbench. Second, I started the standby
with a 300ms(50min) delay.
After that, psql could not connect to the standby server at all. I'm not sure
why the standby did not start.
It might be because the delay feature interferes with the REDO loop when the
standby first starts up.


 I don't see a problem here… the standby should not be in sync with the
 master, it should be delayed. I did step by step what you did and
 after 50 minutes (300ms) the standby was at the same level the
 master was.

I think we should be able to connect to the standby server at any time, even
with the delay option set.


 Did I misunderstand you?

I'm not sure... You might be right, or there might be a better way.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center


Re: [HACKERS] Time-Delayed Standbys

2013-12-04 Thread Andres Freund
Hi,

On 2013-12-03 19:33:16 +, Simon Riggs wrote:
  - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
  XLOG_XACT_COMMIT_COMPACT checks
 
 Why just those? Why not aborts and restore points also?

What would the advantage of waiting on anything but commits be? If it's
not a commit, the action won't change the state of the database (yesyes,
there are exceptions, but those don't have a timestamp)...
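
As a rough illustration of the point about commits, the sketch below computes
a replay deadline only for commit records and lets everything else through
immediately. The record kinds, the struct, and the function name are made up
for the example; they only mirror the idea of XLOG_XACT_COMMIT and
XLOG_XACT_COMMIT_COMPACT and are not the real xlog.c code.

```c
#include <stdbool.h>
#include <stdint.h>

typedef int64_t timestamp_us;

/* Hypothetical record kinds, standing in for the real WAL record types. */
typedef enum
{
    REC_COMMIT,
    REC_COMMIT_COMPACT,
    REC_ABORT,
    REC_RESTORE_POINT,
    REC_OTHER
} rec_kind;

typedef struct
{
    rec_kind     kind;
    timestamp_us xact_time;     /* only meaningful for timestamped records */
} wal_record;

/*
 * Returns true and sets *until when replay of rec should be held back.
 * Only commit records are delayed; a non-positive delay disables the feature.
 */
bool
compute_delay_until(const wal_record *rec, timestamp_us delay_us,
                    timestamp_us *until)
{
    if (delay_us <= 0)
        return false;
    if (rec->kind != REC_COMMIT && rec->kind != REC_COMMIT_COMPACT)
        return false;

    *until = rec->xact_time + delay_us;
    return true;
}
```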

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-04 Thread Mitsumasa KONDO
2013/12/4 Andres Freund and...@2ndquadrant.com

 On 2013-12-04 22:47:47 +0900, Mitsumasa KONDO wrote:
  2013/12/4 Andres Freund and...@2ndquadrant.com
  When this happens, psql cannot connect to the standby server at all. I think
  this behavior is not good.
  It should only delay the recovery position, and old (delayed) table data
  should still be visible.

 That doesn't sound like a good plan - even if the clients cannot connect
 yet, you can still promote the server.

I'm not sure I follow your argument, but doesn't that drift away from the
purpose of this patch?

Just not taking delay into
 consideration at that point seems like it would possibly surprise users
 rather badly in situations they really cannot use such surprises.

Hmm... I think users will be surprised...

I think it is easy to fix this behavior using a recovery flag.
So we had better wait for other comments.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center


Re: [HACKERS] Time-Delayed Standbys

2013-12-04 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:

 So, I proposed this patch previously and I still think it's a
 good idea, but it got voted down on the grounds that it didn't
 deal with clock drift.  I view that as insufficient reason to
 reject the feature, but others disagreed.  Unless some of those
 people have changed their minds, I don't think this patch has
 much future here.

There are many things that a system admin can get wrong.  Failing
to supply this feature because the sysadmin might not be running
ntpd (or equivalent) correctly seems to me to be like not having
the software do fsync because the sysadmin might not have turned
off write-back buffering on drives without persistent storage. 
Either way, poor system management can defeat the feature.  Either
way, I see no reason to withhold the feature from those who manage
their systems in a sane fashion.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Time-Delayed Standbys

2013-12-04 Thread Christian Kruse
Hi,

On 04/12/13 07:22, Kevin Grittner wrote:
 There are many things that a system admin can get wrong.  Failing
 to supply this feature because the sysadmin might not be running
 ntpd (or equivalent) correctly seems to me to be like not having
 the software do fsync because the sysadmin might not have turned
 off write-back buffering on drives without persistent storage. 
 Either way, poor system management can defeat the feature.  Either
 way, I see no reason to withhold the feature from those who manage
 their systems in a sane fashion.

I agree. But maybe we should add a warning in the documentation about
time syncing?

Greetings,
 CK

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services





Re: [HACKERS] Time-Delayed Standbys

2013-12-04 Thread Peter Eisentraut
src/backend/access/transam/xlog.c:5889: trailing whitespace.





Re: [HACKERS] Time-Delayed Standbys

2013-12-04 Thread Simon Riggs
On 3 December 2013 18:46, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Dec 3, 2013 at 12:36 PM, Fabrízio de Royes Mello
 fabriziome...@gmail.com wrote:
 On Tue, Dec 3, 2013 at 2:33 PM, Christian Kruse christ...@2ndquadrant.com
 wrote:

 Hi Fabrizio,

 looks good to me. I did some testing on 9.2.4, 9.2.5 and HEAD. It
 applies and compiles w/o errors or warnings. I set up a master and two
 hot standbys replicating from the master, one with 5 minutes delay and
 one without delay. After that I created a new database and generated
 some test data:

 CREATE TABLE test (val INTEGER);
 INSERT INTO test (val) (SELECT * FROM generate_series(0, 100));

 The non-delayed standby nearly instantly had the data replicated, the
 delayed standby was replicated after exactly 5 minutes. I did not
 notice any problems, errors or warnings.


 Thanks for your review Christian...

 So, I proposed this patch previously and I still think it's a good
 idea, but it got voted down on the grounds that it didn't deal with
 clock drift.  I view that as insufficient reason to reject the
 feature, but others disagreed.  Unless some of those people have
 changed their minds, I don't think this patch has much future here.

I had that objection and others. Since then many people have requested
this feature and have persuaded me that this is worth having and that
my objections are minor points. I now agree with the need for the
feature, almost as written.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-03 Thread Christian Kruse
Hi Fabrizio,

looks good to me. I did some testing on 9.2.4, 9.2.5 and HEAD. It
applies and compiles w/o errors or warnings. I set up a master and two
hot standbys replicating from the master, one with 5 minutes delay and
one without delay. After that I created a new database and generated
some test data:

CREATE TABLE test (val INTEGER);
INSERT INTO test (val) (SELECT * FROM generate_series(0, 100));

The non-delayed standby nearly instantly had the data replicated, the
delayed standby was replicated after exactly 5 minutes. I did not
notice any problems, errors or warnings.

Greetings,
 CK

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services





Re: [HACKERS] Time-Delayed Standbys

2013-12-03 Thread Fabrízio de Royes Mello
On Tue, Dec 3, 2013 at 2:33 PM, Christian Kruse
christ...@2ndquadrant.com wrote:

 Hi Fabrizio,

 looks good to me. I did some testing on 9.2.4, 9.2.5 and HEAD. It
 applies and compiles w/o errors or warnings. I set up a master and two
 hot standbys replicating from the master, one with 5 minutes delay and
 one without delay. After that I created a new database and generated
 some test data:

 CREATE TABLE test (val INTEGER);
 INSERT INTO test (val) (SELECT * FROM generate_series(0, 100));

 The non-delayed standby nearly instantly had the data replicated, the
 delayed standby was replicated after exactly 5 minutes. I did not
 notice any problems, errors or warnings.



Thanks for your review Christian...

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Time-Delayed Standbys

2013-12-03 Thread Robert Haas
On Tue, Dec 3, 2013 at 12:36 PM, Fabrízio de Royes Mello
fabriziome...@gmail.com wrote:
 On Tue, Dec 3, 2013 at 2:33 PM, Christian Kruse christ...@2ndquadrant.com
 wrote:

 Hi Fabrizio,

 looks good to me. I did some testing on 9.2.4, 9.2.5 and HEAD. It
 applies and compiles w/o errors or warnings. I set up a master and two
 hot standbys replicating from the master, one with 5 minutes delay and
 one without delay. After that I created a new database and generated
 some test data:

 CREATE TABLE test (val INTEGER);
 INSERT INTO test (val) (SELECT * FROM generate_series(0, 100));

 The non-delayed standby nearly instantly had the data replicated, the
 delayed standby was replicated after exactly 5 minutes. I did not
 notice any problems, errors or warnings.


 Thanks for your review Christian...

So, I proposed this patch previously and I still think it's a good
idea, but it got voted down on the grounds that it didn't deal with
clock drift.  I view that as insufficient reason to reject the
feature, but others disagreed.  Unless some of those people have
changed their minds, I don't think this patch has much future here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Time-Delayed Standbys

2013-12-03 Thread Joshua D. Drake


On 12/03/2013 10:46 AM, Robert Haas wrote:


On Tue, Dec 3, 2013 at 12:36 PM, Fabrízio de Royes Mello
fabriziome...@gmail.com wrote:

On Tue, Dec 3, 2013 at 2:33 PM, Christian Kruse christ...@2ndquadrant.com
wrote:


Hi Fabrizio,

looks good to me. I did some testing on 9.2.4, 9.2.5 and HEAD. It
applies and compiles w/o errors or warnings. I set up a master and two
hot standbys replicating from the master, one with 5 minutes delay and
one without delay. After that I created a new database and generated
some test data:

CREATE TABLE test (val INTEGER);
INSERT INTO test (val) (SELECT * FROM generate_series(0, 100));

The non-delayed standby nearly instantly had the data replicated, the
delayed standby was replicated after exactly 5 minutes. I did not
notice any problems, errors or warnings.



Thanks for your review Christian...


So, I proposed this patch previously and I still think it's a good
idea, but it got voted down on the grounds that it didn't deal with
clock drift.  I view that as insufficient reason to reject the
feature, but others disagreed.  Unless some of those people have
changed their minds, I don't think this patch has much future here.



I would agree that it is a good idea.

Joshua D. Drake

--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
   a rose in the deeps of my heart. - W.B. Yeats




Re: [HACKERS] Time-Delayed Standbys

2013-12-03 Thread Andres Freund
On 2013-12-03 13:46:28 -0500, Robert Haas wrote:
 On Tue, Dec 3, 2013 at 12:36 PM, Fabrízio de Royes Mello
 fabriziome...@gmail.com wrote:
  On Tue, Dec 3, 2013 at 2:33 PM, Christian Kruse christ...@2ndquadrant.com
  wrote:
 
  Hi Fabrizio,
 
  looks good to me. I did some testing on 9.2.4, 9.2.5 and HEAD. It
  applies and compiles w/o errors or warnings. I set up a master and two
  hot standbys replicating from the master, one with 5 minutes delay and
  one without delay. After that I created a new database and generated
  some test data:
 
  CREATE TABLE test (val INTEGER);
  INSERT INTO test (val) (SELECT * FROM generate_series(0, 100));
 
  The non-delayed standby nearly instantly had the data replicated, the
  delayed standby was replicated after exactly 5 minutes. I did not
  notice any problems, errors or warnings.
 
 
  Thanks for your review Christian...
 
 So, I proposed this patch previously and I still think it's a good
 idea, but it got voted down on the grounds that it didn't deal with
 clock drift.  I view that as insufficient reason to reject the
 feature, but others disagreed.

I really fail to see why clock drift should be this patch's
responsibility. It's not like the world would go under^W data corruption
would ensue if the clocks drift. Your standby would get delayed
imprecisely. Big deal. From what I know of potential users of this
feature, they would set it to at the very least 30min - that's WAY above
the range for acceptable clock-drift on servers.
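
A small worked example of why a few seconds of drift is harmless here. Assume
the standby replays a commit once its own clock reaches the master's commit
timestamp plus the configured delay; then a standby clock that runs s seconds
ahead of the master shortens the real delay by s seconds, and one that runs
behind lengthens it by the same amount. The function name below is
hypothetical.

```c
#include <stdint.h>

/*
 * Real-world delay (in seconds) actually experienced by a commit, given the
 * configured delay and the clock offset (standby clock minus master clock).
 * A positive offset means the standby clock runs ahead of the master.
 */
int64_t
effective_delay_s(int64_t configured_delay_s, int64_t standby_minus_master_s)
{
    return configured_delay_s - standby_minus_master_s;
}
```

For a 30 minute delay (1800 s) and 5 seconds of skew, the real delay comes out
at 1795 or 1805 seconds, an error of well under 1%.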

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-03 Thread Simon Riggs
On 18 October 2013 19:03, Fabrízio de Royes Mello
fabriziome...@gmail.com wrote:

 The attached patch is a continuation of Robert's work [1].

Reviewing v2...


 I made some changes:
 - use of Latches instead of pg_usleep, so we don't have to wakeup regularly.

OK

 - call HandleStartupProcInterrupts() before CheckForStandbyTrigger() because
 it might change the trigger file's location

OK

 - compute recoveryUntilDelayTime in XLOG_XACT_COMMIT and
 XLOG_XACT_COMMIT_COMPACT checks

Why just those? Why not aborts and restore points also?


 - don't care about clockdrift because it's an admin problem.

Few minor points on things

* The code with comment Clear any previous recovery delay time is in
wrong place, move down or remove completely. Setting the delay to zero
doesn't prevent calling recoveryDelay(), so that logic looks wrong
anyway.

* The loop exit in recoveryDelay() is inelegant, should break if <= 0

* There's a spelling mistake in sample

* The patch has whitespace in one place
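
For the loop-exit point in the list above, a minimal sketch of one way to shape
such a loop: recompute the remaining time on every iteration and leave as soon
as it is zero or negative, so an early wakeup simply goes around again. The
clock and wait callbacks are hypothetical stand-ins (the patch itself uses
latches), and the fake clock exists only so the loop can be exercised without
sleeping.

```c
#include <stdint.h>

typedef int64_t timestamp_us;

/* Hypothetical hooks so the loop can run against a simulated clock. */
typedef timestamp_us (*clock_fn) (void *ctx);
typedef void (*wait_fn) (void *ctx, timestamp_us timeout_us);

/* Waits until the clock reaches `until`; returns the number of waits done. */
int
delay_until(timestamp_us until, clock_fn now, wait_fn wait, void *ctx)
{
    int waits = 0;

    for (;;)
    {
        timestamp_us remaining = until - now(ctx);

        if (remaining <= 0)
            break;              /* deadline reached or already passed */
        wait(ctx, remaining);   /* may return early; we just re-check */
        waits++;
    }
    return waits;
}

/* Fake clock: each wait advances time by at most 400 us (an early wakeup). */
typedef struct
{
    timestamp_us now;
} fake_clock;

timestamp_us
fake_now(void *ctx)
{
    return ((fake_clock *) ctx)->now;
}

void
fake_wait(void *ctx, timestamp_us timeout_us)
{
    fake_clock *c = ctx;

    c->now += (timeout_us < 400) ? timeout_us : 400;
}
```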

and one important point...

* The delay loop happens AFTER we check for a pause. Which means if
the user notices a problem on a commit, then hits pause button on the
standby, the pause will have no effect and the next commit will be
applied anyway. Maybe just one commit, but it's an off-by-one error
that removes the benefit of the patch. So I think we need to test this
again after we finish delaying

if (xlogctl->recoveryPause)
  recoveryPausesHere();
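
The off-by-one described above can be shown with a small simulation. This is
only a sketch: the struct, its fields, and both functions are invented for the
example, and the real code would block in recoveryPausesHere() rather than
return.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the shared recovery-control state. */
typedef struct
{
    bool pause_requested;       /* set when the admin asks for a pause */
    bool pause_during_delay;    /* simulate: admin hits pause while we wait */
    int  applied;               /* number of commits applied so far */
} replay_state;

/* Simulated delay: the pause request may arrive while we are waiting. */
void
simulated_delay(replay_state *st)
{
    if (st->pause_during_delay)
        st->pause_requested = true;
}

/* Returns true if the commit was applied, false if replay stopped at a pause. */
bool
replay_commit(replay_state *st, bool recheck_after_delay)
{
    if (st->pause_requested)
        return false;           /* pause check before the delay */

    simulated_delay(st);

    if (recheck_after_delay && st->pause_requested)
        return false;           /* second check, after the delay */

    st->applied++;
    return true;
}
```

Without the second check, a pause requested during the wait does not stop the
commit that was being delayed; with it, that commit is held back.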


We need to explain in the docs that this is intended only for use in a
live streaming deployment. It will have little or no meaning in a
PITR.

I think recovery_time_delay should be called
something_apply_delay
to highlight the point that it is the apply of records that is
delayed, not the receipt. And hence the need to document that sync rep
is NOT slowed down by setting this value.

And to make the name consistent with other parameters, I suggest
min_standby_apply_delay

We also need to document caveats about the patch, in that it only
delays on timestamped WAL records and other records may be applied
sooner than the delay in some circumstances, so it is not a way to
avoid all cancellations.

We also need to document that the behaviour of the patch is to apply all
data received as quickly as possible once triggered, so the specified
delay does not slow down promoting the server to a master. That might
also be seen as a negative behaviour, since promoting the master
effectively sets recovery_time_delay to zero.
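
A minimal sketch of the promotion behaviour described above, assuming the
delay loop consults a promotion flag on each iteration. The function and
parameter names are hypothetical and not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

typedef int64_t timestamp_us;

/*
 * Decide whether replay of a delayed commit must keep waiting. Once promotion
 * has been triggered the delay is ignored, so all received data is applied as
 * quickly as possible.
 */
bool
must_keep_waiting(timestamp_us now, timestamp_us until, bool promote_triggered)
{
    if (promote_triggered)
        return false;
    return now < until;
}
```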

I will handle the additional documentation, if you can update the
patch with the main review comments. Thanks.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Time-Delayed Standbys

2013-12-03 Thread KONDO Mitsumasa

(2013/11/30 5:34), Fabrízio de Royes Mello wrote:

On Fri, Nov 29, 2013 at 5:49 AM, KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp
wrote:
  * Problem1
  Your patch does not add recovery_time_delay to recovery.conf.sample.
  Please add it.
Fixed.

OK. It seems fine.


  * Problem2
  When I set up a time-delayed standby and start the standby server, I cannot
  access the standby server with psql. It is because PG is in its initial
  startup recovery, during which it cannot be accessed with psql. I think that
  a time-delayed standby should only delay the recovery position; it must not
  affect other functionality.
 
  I didn't test recovery on the master server with recovery_time_delay. If you
  have detailed test results for these cases, please send them to me.
 
Well, I could not reproduce the problem that you described.

I run the following test:

1) Clusters
- build master
- build slave and attach to the master using SR and config recovery_time_delay
to 1min.

2) Stop the Slave

3) Run some transactions on the master using pgbench to generate a lot of
archives

4) Start the slave and connect to it using psql and in another session I can see
all archive recovery log

Hmm... I thought I had made a mistake in reading your email, but I reproduced it
again. When I set a small recovery_time_delay (=3), it seemed to work correctly.
However, when I set a long recovery_time_delay (=300), it didn't work.


My log of the reproduced operation follows.

[mitsu-ko@localhost postgresql]$ bin/pgbench -T 30 -c 8 -j4  -p5432
starting vacuum...end.
transaction type: TPC-B (sort of)
scaling factor: 10
query mode: simple
number of clients: 8
number of threads: 4
duration: 30 s
number of transactions actually processed: 68704
latency average: 3.493 ms
tps = 2289.196747 (including connections establishing)
tps = 2290.175129 (excluding connections establishing)
[mitsu-ko@localhost postgresql]$ vim slave/recovery.conf
[mitsu-ko@localhost postgresql]$ bin/pg_ctl -D slave start
server starting
[mitsu-ko@localhost postgresql]$ LOG:  database system was shut down in 
recovery at 2013-12-03 10:26:41 JST
LOG:  entering standby mode
LOG:  consistent recovery state reached at 0/5C4D8668
LOG:  redo starts at 0/5C4000D8
[mitsu-ko@localhost postgresql]$ FATAL:  the database system is starting up
FATAL:  the database system is starting up
FATAL:  the database system is starting up
FATAL:  the database system is starting up
FATAL:  the database system is starting up
[mitsu-ko@localhost postgresql]$ bin/psql -p6543
psql: FATAL:  the database system is starting up
[mitsu-ko@localhost postgresql]$ bin/psql -p6543
psql: FATAL:  the database system is starting up

I attached my postgresql.conf and recovery.conf. They should reproduce the
problem.

I think your patch needs to check recovery flags such as
ArchiveRecoveryRequested and InArchiveRecovery, because a time-delayed standby
should only take effect during replication. I hope it does not interfere with
standby server startup or archive recovery. Does that differ from what you have
in mind? I think this patch has a lot of potential, but I think that basic
standby functionality is more important than this feature. We might need to
discuss what behavior is best for this patch.


Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center


conf.tar.gz
Description: GNU Zip compressed data



Re: [HACKERS] Time-Delayed Standbys

2013-12-03 Thread KONDO Mitsumasa

(2013/12/04 4:00), Andres Freund wrote:

On 2013-12-03 13:46:28 -0500, Robert Haas wrote:

On Tue, Dec 3, 2013 at 12:36 PM, Fabrízio de Royes Mello
fabriziome...@gmail.com wrote:

On Tue, Dec 3, 2013 at 2:33 PM, Christian Kruse christ...@2ndquadrant.com
wrote:


Hi Fabrizio,

looks good to me. I did some testing on 9.2.4, 9.2.5 and HEAD. It
applies and compiles w/o errors or warnings. I set up a master and two
hot standbys replicating from the master, one with 5 minutes delay and
one without delay. After that I created a new database and generated
some test data:

CREATE TABLE test (val INTEGER);
INSERT INTO test (val) (SELECT * FROM generate_series(0, 100));

The non-delayed standby nearly instantly had the data replicated, the
delayed standby was replicated after exactly 5 minutes. I did not
notice any problems, errors or warnings.



Thanks for your review Christian...


So, I proposed this patch previously and I still think it's a good
idea, but it got voted down on the grounds that it didn't deal with
clock drift.  I view that as insufficient reason to reject the
feature, but others disagreed.


I really fail to see why clock drift should be this patch's
responsibility.  It's not like the world would go under^W data corruption
would ensue if the clocks drift. Your standby would get delayed
imprecisely. Big deal. From what I know of potential users of this
feature, they would set it to at the very least 30min - that's WAY above
the range for acceptable clock-drift on servers.

Yes. I think the purpose of this patch is a long time delay on the standby server,
not a finely tuned timing delay.
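
To put the scale argument in numbers, here is a minimal sketch. The function name and the 2-second clock offset are hypothetical values chosen for illustration; only the 30-minute delay comes from the discussion above.

```python
def relative_delay_error(delay_seconds: float, clock_offset_seconds: float) -> float:
    """Fraction by which the effective delay deviates from the configured delay.

    clock_offset_seconds is the (assumed) difference between the standby
    clock and the master clock.
    """
    return abs(clock_offset_seconds) / delay_seconds


# Hypothetical example: a 30-minute delay with the standby clock 2 seconds off.
error = relative_delay_error(30 * 60, 2.0)
print(f"{error:.4%}")  # about 0.11% of the configured delay
```

Under these assumptions the delay is off by a small fraction of a percent, which is the sense in which drift matters little for long delays.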

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center




Re: [HACKERS] Time-Delayed Standbys

2013-11-29 Thread Fabrízio de Royes Mello
On Fri, Nov 29, 2013 at 5:49 AM, KONDO Mitsumasa 
kondo.mitsum...@lab.ntt.co.jp wrote:

 Hi Royes,

 I'm sorry for my late review...


No problem...


 I see potential in your patch for PG's replication functionality, and it might
be useful for everyone. I checked your patch and have some comments for
improvement. I haven't yet tested unexpected situations in detail. But I think
that Problem2 below is an important functional problem, so I ask you to
solve it first.

 * Regress test
 No problem.

 * Problem1
 Your patch does not add recovery_time_delay to recovery.conf.sample.
 Please add it.


Fixed.


 * Problem2
 When I set up a time-delayed standby and start the standby server, I cannot
access the standby server with psql. This is because PG is still in its initial
startup recovery, during which psql cannot connect. I think a time-delayed
standby should only delay the recovery position; it must not affect other
functionality.

 I didn't test recovery on the master server with recovery_time_delay. If you
have detailed test results for these cases, please send them to me.


Well, I could not reproduce the problem that you described.

I ran the following test:

1) Clusters
- build master
- build slave and attach to the master using SR and set
recovery_time_delay to 1min.

2) Stop the slave

3) Run some transactions on the master using pgbench to generate a lot of
archives

4) Start the slave and connect to it using psql; in another session I
can see the whole archive recovery log


 That is all for my first quick review of your patch.


Thanks.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index c0c543e..641c9c6 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -135,6 +135,27 @@ restore_command = 'copy C:\\server\\archivedir\\%f %p'  # Windows
   </listitem>
  </varlistentry>
 
+ <varlistentry id="recovery-time-delay" xreflabel="recovery_time_delay">
+  <term><varname>recovery_time_delay</varname> (<type>integer</type>)</term>
+  <indexterm>
+<primary><varname>recovery_time_delay</> recovery parameter</primary>
+  </indexterm>
+  <listitem>
+   <para>
+Specifies the amount of time (in milliseconds, if no unit is specified)
+which recovery of transaction commits should lag the master.  This
+parameter allows creation of a time-delayed standby.  For example, if
+you set this parameter to <literal>5min</literal>, the standby will
+replay each transaction commit only when the system time on the standby
+is at least five minutes past the commit time reported by the master.
+   </para>
+   <para>
+Note that if the master and standby system clocks are not synchronized,
+this might lead to unexpected results.
+   </para>
+  </listitem>
+ </varlistentry>
+
 </variablelist>
 
   </sect1>
diff --git a/src/backend/access/transam/recovery.conf.sample b/src/backend/access/transam/recovery.conf.sample
index 5acfa57..97cc7af 100644
--- a/src/backend/access/transam/recovery.conf.sample
+++ b/src/backend/access/transam/recovery.conf.sample
@@ -123,6 +123,17 @@
 #
 #trigger_file = ''
 #
+# recovery_time_delay
+#
+# By default, a standby server keeps restoring XLOG records from the
+# primary as soon as possible. If you want to delay the replay of
+# committed transactions from the master, specify a recovery time delay.
+# For example, if you set this parameter to 5min, the standby will replay
+# each transaction commit only when the system time on the standby is at least
+# five minutes past the commit time reported by the master.
+#
+#recovery_time_delay = 0
+#
 #---
 # HOT STANDBY PARAMETERS
 #---
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index de19d22..714b1bd 100755
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -218,6 +218,8 @@ static bool recoveryPauseAtTarget = true;
 static TransactionId recoveryTargetXid;
 static TimestampTz recoveryTargetTime;
 static char *recoveryTargetName;
+static int recovery_time_delay = 0;
+static TimestampTz recoveryDelayUntilTime;
 
 /* options taken from recovery.conf for XLOG streaming */
 static bool StandbyModeRequested = false;
@@ -730,6 +732,7 @@ static void readRecoveryCommandFile(void);
 static void exitArchiveRecovery(TimeLineID endTLI, XLogSegNo endLogSegNo);
 static bool recoveryStopsHere(XLogRecord *record, bool *includeThis);
 static void recoveryPausesHere(void);
+static void recoveryDelay(void);
 static void SetLatestXTime(TimestampTz xtime);
 static void SetCurrentChunkStartTime(TimestampTz 

Re: [HACKERS] Time-Delayed Standbys

2013-11-28 Thread KONDO Mitsumasa

Hi Royes,

I'm sorry for my late review...

I see potential in your patch for PG's replication functionality, and it might
be useful for everyone. I checked your patch and have some comments for
improvement. I haven't yet tested unexpected situations in detail. But I think
that Problem2 below is an important functional problem, so I ask you to
solve it first.


* Regress test
No problem.

* Problem1
Your patch does not add recovery_time_delay to recovery.conf.sample.
Please add it.

* Problem2
When I set up a time-delayed standby and start the standby server, I cannot
access the standby server with psql. This is because PG is still in its initial
startup recovery, during which psql cannot connect. I think a time-delayed
standby should only delay the recovery position; it must not affect other
functionality.


I didn't test recovery on the master server with recovery_time_delay. If you
have detailed test results for these cases, please send them to me.


That is all for my first quick review of your patch.

Regards,
--
Mitsumasa KONDO
NTT Open Source Software Center




Re: [HACKERS] time-delayed standbys

2011-07-02 Thread Simon Riggs
On Thu, Jun 30, 2011 at 6:25 PM, Robert Haas robertmh...@gmail.com wrote:

 I think the time problems are more complex than said. The patch relies
 upon transaction completion times, but not all WAL records have a time
 attached to them. Plus you only used commits anyway, not sure why.

 For the same reason we do that with the recovery_target_* code -
 replaying something like a heap insert or heap update doesn't change
 the user-visible state of the database, because the records aren't
 visible anyway until the commit record is replayed.

 Some actions aren't even transactional, such as DROP DATABASE, amongst

 Good point.  We'd probably need to add a timestamp to the drop
 database record, as that's a case that people would likely want to
 defend against with this feature.

 others. Consecutive records can be hours apart, so it would be
 possible to delay on some WAL records but then replay records that
 happened minutes ago, then wait hours for the next apply. So this
 patch doesn't do what it claims in all cases.

You misread my words above, neglecting the "amongst others" part.

I don't believe you'll be able to do this just by relying on
timestamps on WAL records because not all records carry timestamps and
we're not going to add them just for this.

It's easier to make this work usefully using pg_standby.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] time-delayed standbys

2011-06-30 Thread Simon Riggs
On Thu, Jun 30, 2011 at 2:56 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Jun 29, 2011 at 9:54 PM, Josh Berkus j...@agliodbs.com wrote:
 I am not sure exactly how walreceiver handles it if the disk is full.
 I assume it craps out and eventually retries, so probably what will
 happen is that, after the standby's pg_xlog directory fills up,
 walreceiver will sit there and error out until replay advances enough
 to remove a WAL file and thus permit some more data to be streamed.

 Nope, it gets stuck and stops there.  Replay doesn't advance unless you
 can somehow clear out some space manually; if the disk is full, the disk
 is full, and PostgreSQL doesn't remove WAL files without being able to
 write files first.

 Manual (or scripted) intervention is always necessary if you reach disk
 100% full.

 Wow, that's a pretty crappy failure mode... but I don't think we need
 to fix it just on account of this patch.  It would be nice to fix, of
 course.

How is that different to running out of space in the main database?

If I try to pour a pint of milk into a small cup, I don't blame the cup.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] time-delayed standbys

2011-06-30 Thread Simon Riggs
On Wed, Jun 29, 2011 at 7:11 PM, Robert Haas robertmh...@gmail.com wrote:

 I don't really see how that's any different from what happens now.  If
 (for whatever reason) the master is generating WAL faster than a
 streaming standby can replay it, then the excess WAL is going to pile
 up someplace, and you might run out of disk space.   Time-delaying the
 standby creates an additional way for that to happen, but I don't
 think it's an entirely new problem.

The only way to control this is with a time delay that can be changed
while the server is running. A recovery.conf parameter doesn't allow
that, so another way is preferable.

I think the time problems are more complex than said. The patch relies
upon transaction completion times, but not all WAL records have a time
attached to them. Plus you only used commits anyway, not sure why.
Some actions aren't even transactional, such as DROP DATABASE, amongst
others. Consecutive records can be hours apart, so it would be
possible to delay on some WAL records but then replay records that
happened minutes ago, then wait hours for the next apply. So this
patch doesn't do what it claims in all cases.
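
The concern can be illustrated with a toy replay model. This is only a sketch under stated assumptions, not PostgreSQL's recovery code: the `WalRecord` type, the `apply_times` function, the record kinds, and all the times are hypothetical, and network latency is taken to be zero.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WalRecord:
    kind: str            # hypothetical label, e.g. "insert", "commit", "drop_database"
    written_at: float    # master wall-clock time (seconds) when the record was generated
    has_timestamp: bool  # True only if the record itself carries that time


def apply_times(records: List[WalRecord], delay: float) -> List[Tuple[str, float]]:
    """Return (kind, time applied on the standby) for each record, in WAL order.

    Only records that carry a timestamp are held back until written_at + delay.
    Records without one are applied as soon as they exist and every earlier
    record has been applied.
    """
    clock = 0.0
    applied = []
    for rec in records:
        clock = max(clock, rec.written_at)  # cannot apply a record before it exists
        if rec.has_timestamp:
            clock = max(clock, rec.written_at + delay)
        applied.append((rec.kind, clock))
    return applied


wal = [
    WalRecord("insert", 90, False),
    WalRecord("commit", 100, True),
    WalRecord("drop_database", 3600, False),  # non-transactional, no timestamp
    WalRecord("commit", 4000, True),
]
for kind, when in apply_times(wal, delay=300):
    print(kind, when)
```

In this model both commits are applied 300 seconds late as intended, but the drop_database record is applied at 3600, the moment it was written, because nothing in the record tells the standby to wait.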

Similar discussion on max_standby_delay covered exactly that ground
and went on for weeks in 9.0. IIRC I presented the same case you just
did and we agreed in the end that was not acceptable. I'm not going to
repeat it. Please check the archives.

So, again +1 for the feature, but -1 for the currently proposed
implementation, based upon review.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] time-delayed standbys

2011-06-30 Thread Josh Berkus
On 6/30/11 2:00 AM, Simon Riggs wrote:
 Manual (or scripted) intervention is always necessary if you reach disk
  100% full.
 
  Wow, that's a pretty crappy failure mode... but I don't think we need
  to fix it just on account of this patch.  It would be nice to fix, of
  course.
 How is that different to running out of space in the main database?
 
 If I try to pour a pint of milk into a small cup, I don't blame the cup.

I have to agree with Simon here.  ;-)

We can do some things to make this easier for administrators, but
there's no way to solve the problem.  And the things we could do would
have to be advanced optional modes which aren't on by default, so they
wouldn't really help the DBA with poor planning skills.  Here are my
suggestions:

1) Have a utility (pg_archivecleanup?) which checks if we have more than
a specific setting's worth of archive_logs, and breaks replication and
deletes the archive logs if we hit that number.  This would also require
some way for the standby to stop replicating *without* becoming a
standalone server, which I don't think we currently have.

2) Have a setting where, regardless of standby_delay settings, the
standby will interrupt any running queries and start applying logs as
fast as possible if it hits a certain number of unapplied archive logs.
 Of course, given the issues we had with standby_delay, I'm not sure I
want to complicate it further.

I think we've already fixed the biggest issue in 9.1, since we now have
a limit on the number of WALs the master will keep if archiving is
failing ... yes?  That's the only big *avoidable* failure mode we have,
where a failing standby effectively shuts down the master.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] time-delayed standbys

2011-06-30 Thread Robert Haas
On Thu, Jun 30, 2011 at 1:00 PM, Josh Berkus j...@agliodbs.com wrote:
 On 6/30/11 2:00 AM, Simon Riggs wrote:
 Manual (or scripted) intervention is always necessary if you reach disk
  100% full.
 
  Wow, that's a pretty crappy failure mode... but I don't think we need
  to fix it just on account of this patch.  It would be nice to fix, of
  course.
 How is that different to running out of space in the main database?

 If I try to pour a pint of milk into a small cup, I don't blame the cup.

 I have to agree with Simon here.  ;-)

 We can do some things to make this easier for administrators, but
 there's no way to solve the problem.  And the things we could do would
 have to be advanced optional modes which aren't on by default, so they
 wouldn't really help the DBA with poor planning skills.  Here are my
 suggestions:

 1) Have a utility (pg_archivecleanup?) which checks if we have more than
 a specific setting's worth of archive_logs, and breaks replication and
 deletes the archive logs if we hit that number.  This would also require
 some way for the standby to stop replicating *without* becoming a
 standalone server, which I don't think we currently have.

 2) Have a setting where, regardless of standby_delay settings, the
 standby will interrupt any running queries and start applying logs as
 fast as possible if it hits a certain number of unapplied archive logs.
  Of course, given the issues we had with standby_delay, I'm not sure I
 want to complicate it further.

 I think we've already fixed the biggest issue in 9.1, since we now have
 a limit on the number of WALs the master will keep if archiving is
 failing ... yes?  That's the only big *avoidable* failure mode we have,
 where a failing standby effectively shuts down the master.

I'm not sure we changed anything in this area for 9.1.  Am I wrong?
wal_keep_segments was present in 9.0.  Using that instead of archiving
is a reasonable way to bound the amount of disk space that can get
used, at the cost of possibly needing to rebuild the standby if things
get too far behind.  Of course, in any version, you could also use an
archive_command that will remove old files to make space if the disk
is full, with the same downside: if the standby isn't done with those
files, you're now in for a rebuild.
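
As a rough sizing aid, the space implied by a given wal_keep_segments value can be sketched as below. The function name and the value 256 are hypothetical, and the 16 MB figure is the default segment size. Treat the result as approximate, since pg_xlog can hold additional segments beyond the ones this setting retains.

```python
WAL_SEGMENT_BYTES = 16 * 1024 * 1024  # default WAL segment size (fixed at build time)


def retained_wal_bytes(wal_keep_segments: int) -> int:
    """Approximate disk space held by the segments wal_keep_segments retains."""
    return wal_keep_segments * WAL_SEGMENT_BYTES


# Hypothetical setting: keep 256 segments around for standbys.
print(retained_wal_bytes(256) / 2**30, "GiB")  # 4.0 GiB
```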

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] time-delayed standbys

2011-06-30 Thread Robert Haas
On Thu, Jun 30, 2011 at 6:45 AM, Simon Riggs si...@2ndquadrant.com wrote:
 The only way to control this is with a time delay that can be changed
 while the server is running. A recovery.conf parameter doesn't allow
 that, so another way is preferable.

True.  We've talked about making the recovery.conf parameters into
GUCs, which would address that concern (and some others).

 I think the time problems are more complex than said. The patch relies
 upon transaction completion times, but not all WAL records have a time
 attached to them. Plus you only used commits anyway, not sure why.

For the same reason we do that with the recovery_target_* code -
replaying something like a heap insert or heap update doesn't change
the user-visible state of the database, because the records aren't
visible anyway until the commit record is replayed.

 Some actions aren't even transactional, such as DROP DATABASE, amongst

Good point.  We'd probably need to add a timestamp to the drop
database record, as that's a case that people would likely want to
defend against with this feature.

 others. Consecutive records can be hours apart, so it would be
 possible to delay on some WAL records but then replay records that
 happened minutes ago, then wait hours for the next apply. So this
 patch doesn't do what it claims in all cases.

 Similar discussion on max_standby_delay covered exactly that ground
 and went on for weeks in 9.0. IIRC I presented the same case you just
 did and we agreed in the end that was not acceptable. I'm not going to
 repeat it. Please check the archives.

I think this case is a bit different.  First, max_standby_delay is
relevant for any installation using Hot Standby, whereas this is a
feature that specifically involves time.  Saying that you have to have
time synchronization for Hot Standby to work as designed is more of a
burden than saying you need time synchronization *if you want to use
the time-delayed recovery feature*.  Second, and maybe more
importantly, no one has come up with an idea for how to make this work
reliably in the presence of time skew.  Perhaps we could provide a
simple time-skew correction feature that would work in the streaming
case (though probably not nearly as well as running ntpd), but as I
understand your argument, you're saying that most people will want to
use this with archiving.  I don't see how to make that work without
time synchronization.  In the max_standby_delay case, the requirement
is that queries not get cancelled too aggressively while at the same
time letting the standby get too far behind the master, which leaves
some flexibility in terms of how we actually make that trade-off, and
we eventually found a way that didn't require time synchronization,
which was an improvement.  But for a time-delayed standby, the
requirement at least AIUI is that the state of the standby lag the
master by a certain time interval, and I don't see any way to do that
without comparing slave timestamps with master timestamps.  If we can
find a similar clever trick here, great!  But I'm not seeing how to do
it.
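
A minimal sketch of that comparison, and of what clock skew does to it, follows. The function name and the numbers are hypothetical and this is not the patch's code; it only shows the arithmetic of comparing a master commit timestamp with the standby's own clock.

```python
def seconds_to_wait(commit_time: float, delay: float, standby_now: float) -> float:
    """How long the standby should sleep before replaying a commit record.

    commit_time is stamped by the master's clock, while standby_now is read
    from the standby's clock, so any skew between them shifts the result.
    """
    return max(0.0, commit_time + delay - standby_now)


# Clocks in sync: commit at 1000, delay 300, standby clock reads 1100.
print(seconds_to_wait(1000.0, 300.0, 1100.0))  # 200.0

# Same instant, but the standby clock runs 60 seconds ahead (reads 1160):
# the commit is replayed 60 seconds early, an effective delay of 240.
print(seconds_to_wait(1000.0, 300.0, 1160.0))  # 140.0
```

The skew passes straight through to the effective delay, which is why external time synchronization ends up being a requirement.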

Now, another option here is to give up on the idea of a time-delayed
standby altogether and instead allow the standby to lag the master by
a certain number of WAL segments or XIDs.  Of course, if we do that,
then we will not have a feature called time-delayed standbys.
Instead, we will have a feature called standbys delayed by a certain
number of WAL segments (or XIDs).  That certainly caters to some of
the same use cases, but I think it's severely lacking in the usability
department.  I bet the first thing most people will do is to try to
figure out how to translate between those metrics and time, and I bet
we'll get complaints on systems where the activity load is variable
and therefore the time lag for a fixed WAL-segment lag or XID-lag is
unpredictable.  So I think keeping it defined in terms of time is the
right way forward, even though the need for external time
synchronization is, certainly, not ideal.
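
To illustrate why a fixed WAL-segment lag translates poorly into time, here is a small sketch. The function name, the 64-segment lag, and the two WAL generation rates are hypothetical. The 16 MB segment size is the default.

```python
WAL_SEGMENT_BYTES = 16 * 1024 * 1024  # default WAL segment size


def time_lag_seconds(segment_lag: int, wal_bytes_per_second: float) -> float:
    """Time it takes the master to generate `segment_lag` segments of WAL."""
    return segment_lag * WAL_SEGMENT_BYTES / wal_bytes_per_second


busy = time_lag_seconds(64, 8 * 1024 * 1024)  # 8 MiB/s of WAL
quiet = time_lag_seconds(64, 64 * 1024)       # 64 KiB/s of WAL
print(busy)   # 128.0 seconds
print(quiet)  # 16384.0 seconds, roughly four and a half hours
```

With these assumed rates the same 64-segment setting means about two minutes of protection during busy periods and several hours during quiet ones.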

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] time-delayed standbys

2011-06-30 Thread Josh Berkus
On 6/30/11 10:25 AM, Robert Haas wrote:
 So I think keeping it defined in terms of time is the
 right way forward, even though the need for external time
 synchronization is, certainly, not ideal.

Actually, when we last had the argument about time synchronization,
Kevin Grittner (I believe) pointed out that unsynchronized replication
servers have an assortment of other issues ... like any read query
involving now().  As the person who originally brought up this hurdle, I
felt that his argument defeated mine.

Certainly I can't see any logical way to have time delay in the absence
of clock synchronization of some kind.  Also, I kinda feel like this
discussion seems aimed at overcomplicating a feature which only a small
fraction of our users will ever use.  Let's keep it as simple as possible.

As for delay on streaming replication, I'm for it.  I think that
post-9.1, thanks to pg_basebackup, the number of our users who are doing
archive log shipping is going to drop tremendously.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] time-delayed standbys

2011-06-30 Thread Kevin Grittner
Josh Berkus j...@agliodbs.com wrote:
 
 when we last had the argument about time synchronization,
 Kevin Grittner (I believe) pointed out that unsynchronized
 replication servers have an assortment of other issues ... like
 any read query involving now().
 
I don't remember making that point, although I think it's a valid
one.
 
What I'm sure I pointed out is that we have one central router which
synchronizes to a whole bunch of atomic clocks around the world
using the normal "discard the outliers and average the rest"
algorithm, and then *every single server and workstation on our
network synchronizes to that router*.  Our database servers are all
running on Linux using ntpd.  Our monitoring spams us with email if
any of the clocks falls outside nominal bounds.  (It's been many
years since we had a misconfigured server which triggered that.)
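
For readers unfamiliar with the idea, a heavily simplified sketch of "discard the outliers and average the rest" is shown below. This is not what ntpd actually implements (its selection and clustering steps are considerably more elaborate). The function name, the tolerance, and the sample offsets are all hypothetical.

```python
from statistics import mean, median_low
from typing import Sequence


def combined_offset(offsets: Sequence[float], tolerance: float) -> float:
    """Average the clock offsets that lie within `tolerance` of the median.

    median_low always returns one of the samples, so at least that sample
    survives the filter and the average is well defined.
    """
    mid = median_low(offsets)
    kept = [o for o in offsets if abs(o - mid) <= tolerance]
    return mean(kept)


# Hypothetical offsets (seconds) reported by five time sources; one is far off.
samples = [0.010, 0.012, 0.011, 2.5, 0.009]
print(combined_offset(samples, tolerance=0.05))  # close to 0.0105
```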
 
I think doing anything in PostgreSQL around this beyond allowing
DBAs to trust their server clocks is insane.  The arguments for
using and trusting ntpd are pretty much identical to the arguments
for using and trusting the OS file systems.
 
-Kevin



Re: [HACKERS] time-delayed standbys

2011-06-30 Thread Josh Berkus
Kevin,

 I think doing anything in PostgreSQL around this beyond allowing
 DBAs to trust their server clocks is insane.  The arguments for
 using and trusting ntpd are pretty much identical to the arguments
 for using and trusting the OS file systems.

Oh, you don't want to implement our own NTP?  Coward!

;-)

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] time-delayed standbys

2011-06-30 Thread Robert Haas
On Thu, Jun 30, 2011 at 1:51 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 I think doing anything in PostgreSQL around this beyond allowing
 DBAs to trust their server clocks is insane.  The arguments for
 using and trusting ntpd are pretty much identical to the arguments
 for using and trusting the OS file systems.

Except that implementing our own file system would likely have more
benefit and be less work than implementing our own time
synchronization, at least if we want it to be reliable.

Again, I am not trying to pretend that this is any great shakes.
MySQL's version of this feature apparently does somehow compensate for
time skew, which I assume must mean that their replication works
differently than ours - inter alia, it probably requires a TCP socket
connection between the servers.  Since we don't require that, it
limits our options in this area, but also gives us more options in
other areas.  Still, if I could think of a way to do this that didn't
depend on time synchronization, then I'd be in favor of eliminating
that requirement.  I just can't; and I'm inclined to think it isn't
possible.

I wouldn't be opposed to having an option to try to detect time skew
between the master and the slave and, say, display that information in
pg_stat_replication.  It might be useful to have that data for
monitoring purposes, and it probably wouldn't even be that much code.
However, I'd be a bit hesitant to use that data to correct the
amount of time we spend waiting for time-delayed replication, because
it would doubtless be extremely imprecise compared to real time
synchronization, and considerably more error-prone.  IOW, what you
said.
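
For what such a skew estimate might look like, here is a sketch of one common approach: a single request/reply exchange with the latency assumed symmetric. It is not something from the patch or from PostgreSQL. The function name and all timestamps are hypothetical.

```python
def estimate_skew(standby_sent: float, master_stamp: float, standby_received: float) -> float:
    """Estimate (master clock - standby clock) from one round trip.

    Assumes the request and the reply take equal time on the wire, so the
    master stamped its reply at the midpoint of the round trip as seen on the
    standby clock. The true skew can differ from the estimate by up to half
    the round-trip time.
    """
    midpoint = (standby_sent + standby_received) / 2.0
    return master_stamp - midpoint


# Standby sends at 100.000 and receives the reply at 100.060 (its own clock);
# the master stamped the reply 105.030 (master clock).
skew = estimate_skew(100.000, 105.030, 100.060)
print(round(skew, 3))  # 5.0, with an uncertainty of +/- 0.030
```

The uncertainty grows with the round-trip time, which fits the expectation above that such a figure would be useful for monitoring but imprecise compared to real time synchronization.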

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] time-delayed standbys

2011-06-30 Thread Fujii Masao
On Fri, Jul 1, 2011 at 2:25 AM, Robert Haas robertmh...@gmail.com wrote:
 Some actions aren't even transactional, such as DROP DATABASE, amongst

 Good point.  We'd probably need to add a timestamp to the drop
 database record, as that's a case that people would likely want to
 defend against with this feature.

This means that recovery_target_* code would also need to deal with
DROP DATABASE case.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] time-delayed standbys

2011-06-30 Thread Fujii Masao
On Fri, Jul 1, 2011 at 3:25 AM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Jun 30, 2011 at 1:51 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 I think doing anything in PostgreSQL around this beyond allowing
 DBAs to trust their server clocks is insane.  The arguments for
 using and trusting ntpd are pretty much identical to the arguments
 for using and trusting the OS file systems.

 Except that implementing our own file system would likely have more
 benefit and be less work than implementing our own time
 synchronization, at least if we want it to be reliable.

 Again, I am not trying to pretend that this is any great shakes.
 MySQL's version of this feature apparently does somehow compensate for
 time skew, which I assume must mean that their replication works
 differently than ours - inter alia, it probably requires a TCP socket
 connection between the servers.  Since we don't require that, it
 limits our options in this area, but also gives us more options in
 other areas.  Still, if I could think of a way to do this that didn't
 depend on time synchronization, then I'd be in favor of eliminating
 that requirement.  I just can't; and I'm inclined to think it isn't
 possible.

 I wouldn't be opposed to having an option to try to detect time skew
 between the master and the slave and, say, display that information in
 pg_stat_replication.  It might be useful to have that data for
 monitoring purposes, and it probably wouldn't even be that much code.
 However, I'd be a bit hesitant to use that data to correct the
 amount of time we spend waiting for time-delayed replication, because
 it would doubtless be extremely imprecise compared to real time
 synchronization, and considerably more error-prone.  IOW, what you
 said.

I agree with Robert. It's difficult to implement a time-synchronization feature
that can deal with all the cases, and I'm not sure that's really worth our
time.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] time-delayed standbys

2011-06-30 Thread Jaime Casanova
Fujii Masao masao.fu...@gmail.com writes:

 On Fri, Jul 1, 2011 at 2:25 AM, Robert Haas robertmh...@gmail.com wrote:
 Some actions aren't even transactional, such as DROP DATABASE, amongst

 Good point.  We'd probably need to add a timestamp to the drop
 database record, as that's a case that people would likely want to
 defend against with this feature.

 This means that recovery_target_* code would also need to deal with
 DROP DATABASE case.


there is no problem if you use restore point names... but of course
you lose flexibility (ie: you can't restore to 5 minutes before now)

mmm... a lazy idea: can't we just create a restore point wal record
*before* we actually drop the database? then we won't need to modify
logic about recovery_target_* (if it is only DROP DATABASE maybe that's
enough about complicating code) and we can provide that protection since
9.1

-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL 
Soporte 24x7, desarrollo, capacitación y servicios



Re: [HACKERS] time-delayed standbys

2011-06-29 Thread Simon Riggs
On Wed, Jun 15, 2011 at 6:58 AM, Fujii Masao masao.fu...@gmail.com wrote:

 Or, we should
 implement new promote mode which finishes a recovery as soon as
 promote is requested (i.e., not replay all the available WAL records)?

That's not a new feature. We had it in 8.4, but it was removed.

Originally, we supported fast failover via trigger file.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] time-delayed standbys

2011-06-29 Thread Simon Riggs
On Thu, Jun 16, 2011 at 7:29 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Jun 15, 2011 at 1:58 AM, Fujii Masao masao.fu...@gmail.com wrote:
 When the replication connection is terminated, the standby tries to read
 WAL files from the archive. In this case, there is no walreceiver process,
 so how does the standby calculate the clock difference?

 Good question.  Also, just because we have streaming replication
 available doesn't mean that we should force people to use it.  It's
 still perfectly legit to set up a standby that only uses
 archive_command and restore_command, and it would be nice if this
 feature could still work in such an environment.  I anticipate that
 most people want to use streaming replication, but a time-delayed
 standby is a good example of a case where you might decide you don't
 need it.  It could be useful to have all the WAL present (but not yet
 applied) if you're thinking you might want to promote that standby -
 but my guess is that in many cases, the time-delayed standby will be
 *in addition* to one or more regular standbys that would be the
 primary promotion candidates.  So I can see someone deciding that
 they'd rather not have the load of another walsender on the master,
 and just let the time-delayed standby read from the archive.

 Even if that were not an issue, I'm still more or less of the opinion
 that trying to solve the time synchronization problem is a rathole
 anyway.  To really solve this problem well, you're going to need the
 standby to send a message containing a timestamp, get a reply back
 from the master that contains that timestamp and a master timestamp,
 and then compute based on those two timestamps plus the reply
 timestamp the maximum and minimum possible lag between the two
 machines.  Then you're going to need to guess, based on several cycles
 of this activity, what the actual lag is, and adjust it over time (but
 not too quickly, unless of course a large manual step has occurred) as
 the clocks potentially drift apart from each other.  This is basically
 what ntpd does, except that it can be virtually guaranteed that our
 implementation will suck by comparison.  Time synchronization is
 neither easy nor our core competency, and I think trying to include it
 in this feature is going to result in a net loss of reliability.


This begs the question of why we need this feature at all, in the way proposed.

Streaming replication is designed for immediate transfer of WAL. File
based is more about storing them for some later use.

It seems strange to pollute the *immediate* transfer route with a
delay, when that is easily possible with a small patch to pg_standby
that can wait until the filetime delay is > X before returning.

The main practical problem with this is that most people's WAL
partitions aren't big enough to store the delayed WAL files, which is
why we provide the file archiving route anyway. So in practical terms
this will be unusable, or at least dangerous to use.

+1 for the feature concept, but -1 for adding this to streaming replication.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] time-delayed standbys

2011-06-29 Thread Robert Haas
On Wed, Jun 29, 2011 at 4:00 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Thu, Jun 16, 2011 at 7:29 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Jun 15, 2011 at 1:58 AM, Fujii Masao masao.fu...@gmail.com wrote:
 When the replication connection is terminated, the standby tries to read
 WAL files from the archive. In this case, there is no walreceiver process,
 so how does the standby calculate the clock difference?

 Good question.  Also, just because we have streaming replication
 available doesn't mean that we should force people to use it.  It's
 still perfectly legit to set up a standby that only uses
 archive_command and restore_command, and it would be nice if this
 feature could still work in such an environment.  I anticipate that
 most people want to use streaming replication, but a time-delayed
 standby is a good example of a case where you might decide you don't
 need it.  It could be useful to have all the WAL present (but not yet
 applied) if you're thinking you might want to promote that standby -
 but my guess is that in many cases, the time-delayed standby will be
 *in addition* to one or more regular standbys that would be the
 primary promotion candidates.  So I can see someone deciding that
 they'd rather not have the load of another walsender on the master,
 and just let the time-delayed standby read from the archive.

 Even if that were not an issue, I'm still more or less of the opinion
 that trying to solve the time synchronization problem is a rathole
 anyway.  To really solve this problem well, you're going to need the
 standby to send a message containing a timestamp, get a reply back
 from the master that contains that timestamp and a master timestamp,
 and then compute based on those two timestamps plus the reply
 timestamp the maximum and minimum possible lag between the two
 machines.  Then you're going to need to guess, based on several cycles
 of this activity, what the actual lag is, and adjust it over time (but
 not too quickly, unless of course a large manual step has occurred) as
 the clocks potentially drift apart from each other.  This is basically
 what ntpd does, except that it can be virtually guaranteed that our
 implementation will suck by comparison.  Time synchronization is
 neither easy nor our core competency, and I think trying to include it
 in this feature is going to result in a net loss of reliability.


 This begs the question of why we need this feature at all, in the way proposed.

 Streaming replication is designed for immediate transfer of WAL. File
 based is more about storing them for some later use.

 It seems strange to pollute the *immediate* transfer route with a
 delay, when that is easily possible with a small patch to pg_standby
 that can wait until the filetime delay is > X before returning.

 The main practical problem with this is that most people's WAL
 partitions aren't big enough to store the delayed WAL files, which is
 why we provide the file archiving route anyway. So in practical terms
 this will be unusable, or at least dangerous to use.

 +1 for the feature concept, but -1 for adding this to streaming replication.

As implemented, the feature will work with either streaming
replication or with file-based replication.  I don't see any value in
restricting to work ONLY with file-based replication.

Also, if we were to do it by making pg_standby wait, then the whole
thing would be much less accurate, and the delay would become much
harder to predict, because you'd be operating on the level of entire
WAL segments, rather than individual commit records.
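
A minimal sketch of the granularity point above, in Python. It assumes a
hypothetical model in which each commit is tagged with the WAL segment it
lands in, and in which a restore script only knows when a whole segment file
was completed. The function names and the sample numbers are made up for
illustration and are not part of the patch or of pg_standby.

```python
from typing import Dict, List, Tuple

# (segment name, commit timestamp on the master), in WAL order.
Commit = Tuple[str, float]


def apply_times_per_record(commits: List[Commit], delay: float) -> List[float]:
    """Each commit is applied exactly `delay` seconds after it happened."""
    return [ts + delay for _, ts in commits]


def apply_times_per_segment(
    commits: List[Commit], segment_done: Dict[str, float], delay: float
) -> List[float]:
    """Every commit in a segment waits for that segment's file time plus
    `delay`, as a wait inside a restore script would behave in this model."""
    return [segment_done[seg] + delay for seg, _ in commits]


def effective_delays(commits: List[Commit], apply_times: List[float]) -> List[float]:
    """How long each commit was actually held back."""
    return [applied - ts for (_, ts), applied in zip(commits, apply_times)]


# Hypothetical workload: segment A fills quickly, segment B fills slowly.
commits = [("A", 0.0), ("A", 50.0), ("B", 60.0)]
segment_done = {"A": 55.0, "B": 400.0}
```

With a 300 second target, the per-record version holds every commit back by
exactly 300 seconds, while the per-segment version holds the same three
commits back by 355, 305 and 640 seconds in this made-up workload.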

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] time-delayed standbys

2011-06-29 Thread Simon Riggs
On Wed, Jun 29, 2011 at 1:24 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Jun 29, 2011 at 4:00 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Thu, Jun 16, 2011 at 7:29 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Jun 15, 2011 at 1:58 AM, Fujii Masao masao.fu...@gmail.com wrote:
 When the replication connection is terminated, the standby tries to read
 WAL files from the archive. In this case, there is no walreceiver process,
 so how does the standby calculate the clock difference?

 Good question.  Also, just because we have streaming replication
 available doesn't mean that we should force people to use it.  It's
 still perfectly legit to set up a standby that only uses
 archive_command and restore_command, and it would be nice if this
 feature could still work in such an environment.  I anticipate that
 most people want to use streaming replication, but a time-delayed
 standby is a good example of a case where you might decide you don't
 need it.  It could be useful to have all the WAL present (but not yet
 applied) if you're thinking you might want to promote that standby -
 but my guess is that in many cases, the time-delayed standby will be
 *in addition* to one or more regular standbys that would be the
 primary promotion candidates.  So I can see someone deciding that
 they'd rather not have the load of another walsender on the master,
 and just let the time-delayed standby read from the archive.

 Even if that were not an issue, I'm still more or less of the opinion
 that trying to solve the time synchronization problem is a rathole
 anyway.  To really solve this problem well, you're going to need the
 standby to send a message containing a timestamp, get a reply back
 from the master that contains that timestamp and a master timestamp,
 and then compute based on those two timestamps plus the reply
 timestamp the maximum and minimum possible lag between the two
 machines.  Then you're going to need to guess, based on several cycles
 of this activity, what the actual lag is, and adjust it over time (but
 not too quickly, unless of course a large manual step has occurred) as
 the clocks potentially drift apart from each other.  This is basically
 what ntpd does, except that it can be virtually guaranteed that our
 implementation will suck by comparison.  Time synchronization is
 neither easy nor our core competency, and I think trying to include it
 in this feature is going to result in a net loss of reliability.


 This begs the question of why we need this feature at all, in the way proposed.

 Streaming replication is designed for immediate transfer of WAL. File
 based is more about storing them for some later use.

 It seems strange to pollute the *immediate* transfer route with a
 delay, when that is easily possible with a small patch to pg_standby
 that can wait until the filetime delay is > X before returning.

 The main practical problem with this is that most people's WAL
 partitions aren't big enough to store the delayed WAL files, which is
 why we provide the file archiving route anyway. So in practical terms
 this will be unusable, or at least dangerous to use.

 +1 for the feature concept, but -1 for adding this to streaming replication.

 As implemented, the feature will work with either streaming
 replication or with file-based replication.

That sounds like the exact opposite of yours and Fujii's comments
above. Please explain.

 I don't see any value in
 restricting to work ONLY with file-based replication.

As explained above, it won't work in practice because of the amount of
file space required.

Or, an alternative question: what will you do when it waits so long
that the standby runs out of disk space?

If you hard-enforce the time delay specified then you just make
replication fail under heavy loads.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] time-delayed standbys

2011-06-29 Thread Robert Haas
On Wed, Jun 29, 2011 at 1:50 PM, Simon Riggs si...@2ndquadrant.com wrote:
 As implemented, the feature will work with either streaming
 replication or with file-based replication.

 That sounds like the exact opposite of yours and Fujii's comments
 above. Please explain.

I think our comments above were addressing the issue of whether it's
feasible to correct for time skew between the master and the slave.
Tom was arguing that we should try, but I was arguing that any system
we put together is likely to be pretty unreliable (since good time
synchronization algorithms are quite complex, and to my knowledge no
one here is an expert on implementing them, nor do I think we want
that much complexity in the backend) and Fujii was pointing out that
it won't work at all if the WAL files are going through the archive
rather than through streaming replication, which (if I understand you
correctly) will be a more common case than I had assumed.

 I don't see any value in
 restricting to work ONLY with file-based replication.

 As explained above, it won't work in practice because of the amount of
 file space required.

I guess it depends on how busy your system is and how much disk space
you have.  If using streaming replication causes pg_xlog to fill up on
your standby, then you can either (1) put pg_xlog on a larger file
system or (2) configure only restore_command and not primary_conninfo,
so that only the archive is used.

 Or, an alternative question: what will you do when it waits so long
 that the standby runs out of disk space?

I don't really see how that's any different from what happens now.  If
(for whatever reason) the master is generating WAL faster than a
streaming standby can replay it, then the excess WAL is going to pile
up someplace, and you might run out of disk space.   Time-delaying the
standby creates an additional way for that to happen, but I don't
think it's an entirely new problem.

I am not sure exactly how walreceiver handles it if the disk is full.
I assume it craps out and eventually retries, so probably what will
happen is that, after the standby's pg_xlog directory fills up,
walreceiver will sit there and error out until replay advances enough
to remove a WAL file and thus permit some more data to be streamed.
If the standby gets far enough behind the master that the required
files are no longer there, then it will switch to the archive, if
available.  It might be nice to have a mode that only allows streaming
replication when the amount of disk space on the standby is greater
than or equal to some threshold, but that seems like a topic for
another patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] time-delayed standbys

2011-06-29 Thread Josh Berkus
Robert,

 I don't really see how that's any different from what happens now.  If
 (for whatever reason) the master is generating WAL faster than a
 streaming standby can replay it, then the excess WAL is going to pile
 up someplace, and you might run out of disk space.   Time-delaying the
 standby creates an additional way for that to happen, but I don't
 think it's an entirely new problem.

Not remotely new.  xlog partition full is currently 75% of the emergency
support calls PGX gets from clients on 9.0 (if only they'd pay attention
to their nagios alerts!)

 I am not sure exactly how walreceiver handles it if the disk is full.
 I assume it craps out and eventually retries, so probably what will
 happen is that, after the standby's pg_xlog directory fills up,
 walreceiver will sit there and error out until replay advances enough
 to remove a WAL file and thus permit some more data to be streamed.

Nope, it gets stuck and stops there.  Replay doesn't advance unless you
can somehow clear out some space manually; if the disk is full, the disk
is full, and PostgreSQL doesn't remove WAL files without being able to
write files first.

Manual (or scripted) intervention is always necessary if you reach disk
100% full.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] time-delayed standbys

2011-06-29 Thread Josh Berkus
On 6/29/11 11:11 AM, Robert Haas wrote:
 If the standby gets far enough behind the master that the required
 files are no longer there, then it will switch to the archive, if
 available.

One more thing:

As I understand it (and my testing shows this), the standby *prefers*
the archive logs, and won't switch to streaming until it reaches the end
of the archive logs.  This is desirable behavior, as it minimizes the
load on the master.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] time-delayed standbys

2011-06-29 Thread Robert Haas
On Wed, Jun 29, 2011 at 9:54 PM, Josh Berkus j...@agliodbs.com wrote:
 I am not sure exactly how walreceiver handles it if the disk is full.
 I assume it craps out and eventually retries, so probably what will
 happen is that, after the standby's pg_xlog directory fills up,
 walreceiver will sit there and error out until replay advances enough
 to remove a WAL file and thus permit some more data to be streamed.

 Nope, it gets stuck and stops there.  Replay doesn't advance unless you
 can somehow clear out some space manually; if the disk is full, the disk
 is full, and PostgreSQL doesn't remove WAL files without being able to
 write files first.

 Manual (or scripted) intervention is always necessary if you reach disk
 100% full.

Wow, that's a pretty crappy failure mode... but I don't think we need
to fix it just on account of this patch.  It would be nice to fix, of
course.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] time-delayed standbys

2011-06-29 Thread Fujii Masao
On Wed, Jun 29, 2011 at 11:14 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Jun 15, 2011 at 1:58 AM, Fujii Masao masao.fu...@gmail.com wrote:
 After we run pg_ctl promote, time-delayed replication should be disabled?
 Otherwise, failover might take very long time when we set recovery_time_delay
 to high value.

 PFA a patch that I believe will disable recovery_time_delay after
 promotion.  The only change from the previous version is:

 diff --git a/src/backend/access/transam/xlog.c 
 b/src/backend/access/transam/xlog
 index 1dbf792..41b3ae9 100644
 --- a/src/backend/access/transam/xlog.c
 +++ b/src/backend/access/transam/xlog.c
 @@ -5869,7 +5869,7 @@ pg_is_xlog_replay_paused(PG_FUNCTION_ARGS)
  static void
  recoveryDelay(void)
  {
 -       while (1)
 +       while (!CheckForStandbyTrigger())
        {
                long    secs;
                int             microsecs;

Thanks for updating patch! I have a few comments;

ISTM recoveryDelayUntilTime also needs to be calculated when replaying the
commit *compact* WAL record (i.e., record_info == XLOG_XACT_COMMIT_COMPACT).

When the user uses only two-phase commit on the master, ISTM he or she cannot
use this feature, because recoveryDelayUntilTime is never set in that case.
Is this intentional?

Should we also disable this feature after recovery reaches the stop point
(specified in recovery_target_xxx)?
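
To make the first two comments concrete, here is a rough Python sketch of the
bookkeeping being discussed. Assumptions: the record kinds are plain strings
modeled on XLOG_XACT_COMMIT, XLOG_XACT_COMMIT_COMPACT and
XLOG_XACT_COMMIT_PREPARED, and update_delay_until and final_delay_target are
hypothetical helpers, not functions in xlog.c. The point is only that every
record kind that ends a transaction has to feed the delay target; otherwise a
workload made up of an unhandled kind is never delayed.

```python
from typing import Iterable, Optional, Set, Tuple

# Illustrative sets of record kinds that the replay code pays attention to.
COMMIT_ONLY: Set[str] = {"COMMIT"}
ALL_COMMIT_KINDS: Set[str] = {"COMMIT", "COMMIT_COMPACT", "COMMIT_PREPARED"}


def update_delay_until(
    current: Optional[float],
    kind: str,
    commit_time: float,
    delay: float,
    handled: Set[str],
) -> Optional[float]:
    """Return the new apply deadline after seeing one WAL record.

    Records whose kind is not in `handled` leave the deadline untouched.
    """
    if kind in handled:
        return commit_time + delay
    return current


def final_delay_target(
    records: Iterable[Tuple[str, float]], delay: float, handled: Set[str]
) -> Optional[float]:
    """Replay a stream of (kind, commit_time) records and return the last
    deadline that was set, or None if no record ever set one."""
    target: Optional[float] = None
    for kind, commit_time in records:
        target = update_delay_until(target, kind, commit_time, delay, handled)
    return target
```

If only "COMMIT" is handled, a stream made up entirely of "COMMIT_PREPARED"
records leaves the target at None, which corresponds to the two-phase-commit
case raised above.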

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] time-delayed standbys

2011-06-29 Thread Fujii Masao
On Thu, Jun 30, 2011 at 10:56 AM, Robert Haas robertmh...@gmail.com wrote:
 Nope, it gets stuck and stops there.  Replay doesn't advance unless you
 can somehow clear out some space manually; if the disk is full, the disk
 is full, and PostgreSQL doesn't remove WAL files without being able to
 write files first.

 Manual (or scripted) intervention is always necessary if you reach disk
 100% full.

 Wow, that's a pretty crappy failure mode... but I don't think we need
 to fix it just on account of this patch.  It would be nice to fix, of
 course.

Yeah, we need to fix that as a separate patch. The difficult point is that
we cannot delete WAL files until we replay the checkpoint record and a
restartpoint occurs. But if the disk is full, there is no space to
receive the checkpoint record, so we could never delete any WAL files.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] time-delayed standbys

2011-06-29 Thread Fujii Masao
On Thu, Jun 30, 2011 at 12:14 PM, Fujii Masao masao.fu...@gmail.com wrote:
 We should disable this feature also after recovery reaches the stop
 point (specified in recovery_target_xxx)?

Another comment: it would be very helpful to document the behavior of a
delayed standby when promoting or after reaching the stop point.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] time-delayed standbys

2011-06-28 Thread Robert Haas
On Wed, Jun 15, 2011 at 1:58 AM, Fujii Masao masao.fu...@gmail.com wrote:
 After we run pg_ctl promote, time-delayed replication should be disabled?
 Otherwise, failover might take very long time when we set recovery_time_delay
 to high value.

PFA a patch that I believe will disable recovery_time_delay after
promotion.  The only change from the previous version is:

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog
index 1dbf792..41b3ae9 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -5869,7 +5869,7 @@ pg_is_xlog_replay_paused(PG_FUNCTION_ARGS)
 static void
 recoveryDelay(void)
 {
-       while (1)
+       while (!CheckForStandbyTrigger())
        {
                long    secs;
                int             microsecs;
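
For readers who do not have xlog.c open, the effect of that one-line change
can be sketched in Python as follows. This is only a model of the control
flow, not the patch itself: recovery_delay, promote_requested and max_nap are
hypothetical names, and the clock and sleep functions are passed in so the
loop can be exercised without really waiting.

```python
import time
from typing import Callable


def recovery_delay(
    delay_until: float,
    promote_requested: Callable[[], bool],
    now: Callable[[], float] = time.time,
    sleep: Callable[[float], None] = time.sleep,
    max_nap: float = 1.0,
) -> bool:
    """Wait until `delay_until`, but stop early if promotion is requested.

    Returns True if the full delay elapsed, False if promotion cut it short.
    """
    # The promotion check sits in the loop condition, mirroring the change
    # from `while (1)` to `while (!CheckForStandbyTrigger())`.
    while not promote_requested():
        remaining = delay_until - now()
        if remaining <= 0:
            return True
        # Nap in short slices so a promotion request is noticed promptly.
        sleep(min(remaining, max_nap))
    return False
```

With the old `while (1)` form there is no second exit from the loop, so a
promotion request would have to wait out the whole remaining delay.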

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


time-delayed-standby-v2.patch
Description: Binary data



Re: [HACKERS] time-delayed standbys

2011-06-19 Thread Fujii Masao
On Fri, Jun 17, 2011 at 11:34 AM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Jun 16, 2011 at 10:10 PM, Fujii Masao masao.fu...@gmail.com wrote:
 According to the above page, one purpose of time-delayed replication is to
 protect against user mistakes on master. But, when an user notices his wrong
 operation on master, what should he do next? The WAL records of his wrong
 operation might have already arrived at the standby, so neither promote nor
 restart doesn't cancel that wrong operation. Instead, probably he should
 shutdown the standby, investigate the timestamp of XID of the operation
 he'd like to cancel, set recovery_target_time and restart the standby.
 Something like this procedures should be documented? Or, we should
 implement new promote mode which finishes a recovery as soon as
 promote is requested (i.e., not replay all the available WAL records)?

 I like the idea of a new promote mode;

 Are you going to implement that mode in this CF? or next one?

 I wasn't really planning on it - I thought you might want to take a
 crack at it.  The feature is usable without that, just maybe a bit
 less cool.

Right.

 Certainly, it's too late for any more formal submissions
 to this CF, but I wouldn't mind reviewing a patch if you want to write
 one.

Okay, I'll add that to my TODO list. But I might not have enough time
to develop it. So, everyone, please feel free to implement it if you want!

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] time-delayed standbys

2011-06-16 Thread Robert Haas
On Wed, Jun 15, 2011 at 1:58 AM, Fujii Masao masao.fu...@gmail.com wrote:
 When the replication connection is terminated, the standby tries to read
 WAL files from the archive. In this case, there is no walreceiver process,
 so how does the standby calculate the clock difference?

Good question.  Also, just because we have streaming replication
available doesn't mean that we should force people to use it.  It's
still perfectly legit to set up a standby that only uses
archive_command and restore_command, and it would be nice if this
feature could still work in such an environment.  I anticipate that
most people want to use streaming replication, but a time-delayed
standby is a good example of a case where you might decide you don't
need it.  It could be useful to have all the WAL present (but not yet
applied) if you're thinking you might want to promote that standby -
but my guess is that in many cases, the time-delayed standby will be
*in addition* to one or more regular standbys that would be the
primary promotion candidates.  So I can see someone deciding that
they'd rather not have the load of another walsender on the master,
and just let the time-delayed standby read from the archive.

Even if that were not an issue, I'm still more or less of the opinion
that trying to solve the time synchronization problem is a rathole
anyway.  To really solve this problem well, you're going to need the
standby to send a message containing a timestamp, get a reply back
from the master that contains that timestamp and a master timestamp,
and then compute based on those two timestamps plus the reply
timestamp the maximum and minimum possible lag between the two
machines.  Then you're going to need to guess, based on several cycles
of this activity, what the actual lag is, and adjust it over time (but
not too quickly, unless of course a large manual step has occurred) as
the clocks potentially drift apart from each other.  This is basically
what ntpd does, except that it can be virtually guaranteed that our
implementation will suck by comparison.  Time synchronization is
neither easy nor our core competency, and I think trying to include it
in this feature is going to result in a net loss of reliability.
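
To give a feel for even the simplest part of that, here is a small Python
sketch of the single-exchange arithmetic described above. Everything in it is
an illustrative assumption (the Exchange type, its field names, and the
midpoint estimate); it is not code from the patch and it is far cruder than
what ntpd does.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Exchange:
    """One request/reply round trip; all field names are illustrative."""
    standby_send: float  # standby clock when the request left
    master_stamp: float  # master clock when the reply was built
    standby_recv: float  # standby clock when the reply arrived


def offset_bounds(x: Exchange) -> Tuple[float, float]:
    """Bounds on (master clock - standby clock) from one exchange.

    The master stamped its reply at some standby-clock instant between
    standby_send and standby_recv, so the true offset lies in this range.
    """
    return (x.master_stamp - x.standby_recv, x.master_stamp - x.standby_send)


def estimate_offset(exchanges: List[Exchange]) -> float:
    """Intersect the bounds from several exchanges and return the midpoint."""
    lo = max(offset_bounds(x)[0] for x in exchanges)
    hi = min(offset_bounds(x)[1] for x in exchanges)
    if lo > hi:
        raise ValueError("inconsistent samples: a clock stepped or drifted")
    return (lo + hi) / 2.0
```

Even this only bounds the offset under the assumption that neither clock moves
between samples. It does nothing about drift or manual clock steps, which is
where the real complexity lies.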

 errmsg("parameter \"%s\" requires a temporal value", "recovery_time_delay"),

 We should s/a temporal/an Integer?

It seems strange to ask for an integer when what we want is an amount
of time in seconds or minutes...

 After we run pg_ctl promote, time-delayed replication should be disabled?
 Otherwise, failover might take very long time when we set recovery_time_delay
 to high value.

Yeah, I think so.

 http://forge.mysql.com/worklog/task.php?id=344
 According to the above page, one purpose of time-delayed replication is to
 protect against user mistakes on master. But, when an user notices his wrong
 operation on master, what should he do next? The WAL records of his wrong
 operation might have already arrived at the standby, so neither promote nor
 restart doesn't cancel that wrong operation. Instead, probably he should
 shutdown the standby, investigate the timestamp of XID of the operation
 he'd like to cancel, set recovery_target_time and restart the standby.
 Something like this procedures should be documented? Or, we should
 implement new promote mode which finishes a recovery as soon as
 promote is requested (i.e., not replay all the available WAL records)?

I like the idea of a new promote mode; and documenting the other
approach you mention doesn't sound bad either.  Either one sounds like
a job for a separate patch, though.

The other option is to pause recovery and run pg_dump...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] time-delayed standbys

2011-06-16 Thread Fujii Masao
On Fri, Jun 17, 2011 at 3:29 AM, Robert Haas robertmh...@gmail.com wrote:
 Even if that were not an issue, I'm still more or less of the opinion
 that trying to solve the time synchronization problem is a rathole
 anyway.  To really solve this problem well, you're going to need the
 standby to send a message containing a timestamp, get a reply back
 from the master that contains that timestamp and a master timestamp,
 and then compute based on those two timestamps plus the reply
 timestamp the maximum and minimum possible lag between the two
 machines.  Then you're going to need to guess, based on several cycles
 of this activity, what the actual lag is, and adjust it over time (but
 not too quickly, unless of course a large manual step has occurred) as
 the clocks potentially drift apart from each other.  This is basically
 what ntpd does, except that it can be virtually guaranteed that our
 implementation will suck by comparison.  Time synchronization is
 neither easy nor our core competency, and I think trying to include it
 in this feature is going to result in a net loss of reliability.

Agreed. You've already added the note about time synchronization into
the document. That's enough, I think.

 errmsg("parameter \"%s\" requires a temporal value", "recovery_time_delay"),

 We should s/a temporal/an Integer?

 It seems strange to ask for an integer when what we want is an amount
 of time in seconds or minutes...

OK.

 http://forge.mysql.com/worklog/task.php?id=344
 According to the above page, one purpose of time-delayed replication is to
 protect against user mistakes on master. But, when an user notices his wrong
 operation on master, what should he do next? The WAL records of his wrong
 operation might have already arrived at the standby, so neither promote nor
 restart doesn't cancel that wrong operation. Instead, probably he should
 shutdown the standby, investigate the timestamp of XID of the operation
 he'd like to cancel, set recovery_target_time and restart the standby.
 Something like this procedures should be documented? Or, we should
 implement new promote mode which finishes a recovery as soon as
 promote is requested (i.e., not replay all the available WAL records)?

 I like the idea of a new promote mode;

Are you going to implement that mode in this CF? or next one?

 and documenting the other
 approach you mention doesn't sound bad either.  Either one sounds like
 a job for a separate patch, though.

 The other option is to pause recovery and run pg_dump...

Yes, please.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] time-delayed standbys

2011-06-16 Thread Robert Haas
On Thu, Jun 16, 2011 at 10:10 PM, Fujii Masao masao.fu...@gmail.com wrote:
 According to the above page, one purpose of time-delayed replication is to
 protect against user mistakes on master. But, when an user notices his wrong
 operation on master, what should he do next? The WAL records of his wrong
 operation might have already arrived at the standby, so neither promote nor
 restart doesn't cancel that wrong operation. Instead, probably he should
 shutdown the standby, investigate the timestamp of XID of the operation
 he'd like to cancel, set recovery_target_time and restart the standby.
 Something like this procedures should be documented? Or, we should
 implement new promote mode which finishes a recovery as soon as
 promote is requested (i.e., not replay all the available WAL records)?

 I like the idea of a new promote mode;

 Are you going to implement that mode in this CF, or in the next one?

I wasn't really planning on it - I thought you might want to take a
crack at it.  The feature is usable without that, just maybe a bit
less cool.  Certainly, it's too late for any more formal submissions
to this CF, but I wouldn't mind reviewing a patch if you want to write
one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] time-delayed standbys

2011-06-15 Thread Jaime Casanova
On Wed, Jun 15, 2011 at 12:58 AM, Fujii Masao masao.fu...@gmail.com wrote:

 http://forge.mysql.com/worklog/task.php?id=344
 According to the above page, one purpose of time-delayed replication is to
 protect against user mistakes on the master. But when a user notices a wrong
 operation on the master, what should he do next? The WAL records of the wrong
 operation might have already arrived at the standby, so neither promote nor
 restart cancels that wrong operation. Instead, he should probably
 shut down the standby, investigate the timestamp of the XID of the operation
 he'd like to cancel, set recovery_target_time and restart the standby.
 Should a procedure like this be documented? Or should we
 implement a new promote mode which finishes recovery as soon as
 promote is requested (i.e., without replaying all the available WAL records)?


I would prefer something like pg_ctl promote -m immediate, which
terminates the recovery.

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación



Re: [HACKERS] time-delayed standbys

2011-06-14 Thread Fujii Masao
On Thu, Apr 21, 2011 at 12:18 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Apr 20, 2011 at 11:15 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I am a bit concerned about the reliability of this approach.  If there
 is some network lag, or some lag in processing from the master, we
 could easily get the idea that there is time skew between the machines
 when there really isn't.  And our perception of the time skew could
 easily bounce around from message to message, as the lag varies.  I
 think it would be tremendously ironic if the two machines were
 actually synchronized to the microsecond, but by trying to be clever
 about it we managed to make the lag-time accurate only to within
 several seconds.

 Well, if walreceiver concludes that there is no more than a few seconds'
 difference between the clocks, it'd probably be OK to take the master
 timestamps at face value.  The problem comes when the skew gets large
 (compared to the configured time delay, I guess).

 I suppose.  Any bound on how much lag there can be before we start
 applying skew correction is going to be fairly arbitrary.

When the replication connection is terminated, the standby tries to read
WAL files from the archive. In this case, there is no walreceiver process,
so how does the standby calculate the clock difference?
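
The concern quoted above can be illustrated with a small C sketch. It assumes, purely for illustration, that the standby estimates clock skew as the difference between its own receipt time and the master's send timestamp. The names estimate_skew_us and skew_spread_us are hypothetical and are not taken from the patch. With perfectly synchronized clocks the estimate simply equals each message's transit lag, so it moves around as the lag does.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Apparent skew for one message: standby receive time minus master send
 * time. From a single message, true clock skew and transit lag cannot be
 * told apart.
 */
int64_t
estimate_skew_us(int64_t master_send_us, int64_t standby_recv_us)
{
    return standby_recv_us - master_send_us;
}

/*
 * Spread (max - min) of the per-message estimates. A large spread means
 * the estimate is dominated by varying lag. n must be at least 1.
 */
int64_t
skew_spread_us(const int64_t *send_us, const int64_t *recv_us, size_t n)
{
    int64_t lo = estimate_skew_us(send_us[0], recv_us[0]);
    int64_t hi = lo;

    for (size_t i = 1; i < n; i++)
    {
        int64_t s = estimate_skew_us(send_us[i], recv_us[i]);

        if (s < lo)
            lo = s;
        if (s > hi)
            hi = s;
    }
    return hi - lo;
}
```

When WAL is read from the archive there is no per-message receive time at all, so an estimator of this shape would have no input, which is the gap the question above points at.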

 errmsg("parameter \"%s\" requires a temporal value", "recovery_time_delay"),

We should s/a temporal/an Integer?

After we run pg_ctl promote, should time-delayed replication be disabled?
Otherwise, failover might take a very long time when recovery_time_delay
is set to a high value.
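
To make the question concrete, here is a minimal C sketch of the decision being discussed: whether replay of a commit record should still be delayed once promotion has been requested. The function name apply_delay_remaining, its parameters, and the microsecond units are assumptions made for illustration. This is not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): returns how many microseconds
 * replay of a commit record should still wait, or 0 if it may be applied
 * now.
 *
 * commit_ts_us      - commit timestamp taken from the WAL record
 * now_us            - current time on the standby
 * delay_us          - configured delay (recovery_time_delay in this thread)
 * promote_requested - true once promotion has been requested
 */
int64_t
apply_delay_remaining(int64_t commit_ts_us, int64_t now_us,
                      int64_t delay_us, bool promote_requested)
{
    int64_t apply_at_us;

    /* Assumed policy: a promotion request cancels any remaining delay. */
    if (promote_requested || delay_us <= 0)
        return 0;

    apply_at_us = commit_ts_us + delay_us;
    if (apply_at_us <= now_us)
        return 0;               /* record is already old enough */

    return apply_at_us - now_us;
}
```

Under this assumed policy, failover time no longer depends on the configured delay, at the cost of replaying records that the delay was meant to hold back.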

http://forge.mysql.com/worklog/task.php?id=344
According to the above page, one purpose of time-delayed replication is to
protect against user mistakes on the master. But when a user notices a wrong
operation on the master, what should he do next? The WAL records of the wrong
operation might have already arrived at the standby, so neither promote nor
restart cancels that wrong operation. Instead, he should probably
shut down the standby, investigate the timestamp of the XID of the operation
he'd like to cancel, set recovery_target_time and restart the standby.
Should a procedure like this be documented? Or should we
implement a new promote mode which finishes recovery as soon as
promote is requested (i.e., without replaying all the available WAL records)?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] time-delayed standbys

2011-05-11 Thread Heikki Linnakangas

On 07.05.2011 16:48, Robert Haas wrote:

I was able to reproduce something very like this in unpatched master,
just by letting recovery pause at a named restore point, and then
resuming it.

LOG:  recovery stopping at restore point "stop", time 2011-05-07 09:28:01.652958-04
LOG:  recovery has paused
HINT:  Execute pg_xlog_replay_resume() to continue.
(at this point I did pg_xlog_replay_resume())
LOG:  redo done at 0/520
PANIC:  wal receiver still active
LOG:  startup process (PID 38762) was terminated by signal 6: Abort trap
LOG:  terminating any other active server processes

I'm thinking that this code is wrong:

 if (recoveryPauseAtTarget && standbyState == STANDBY_SNAPSHOT_READY)
 {
     SetRecoveryPause(true);
     recoveryPausesHere();
 }
 reachedStopPoint = true;    /* see below */
 recoveryContinue = false;

I think that recoveryContinue = false assignment should not happen if
we decide to pause.  That is, we should say if (recoveryPauseAtTarget
&& standbyState == STANDBY_SNAPSHOT_READY) { same as now } else
recoveryContinue = false.


No, recovery stops at that point whether or not you pause. Resuming 
after stopping at the recovery target doesn't mean that you resume 
recovery, it means that you resume to end recovery and start up the 
server (see the 2nd to last paragraph at 
http://www.postgresql.org/docs/9.1/static/recovery-target-settings.html). It 
would probably be more useful to allow a new stopping target to be set 
and continue recovery, but the current pause/resume functions don't 
allow that.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] time-delayed standbys

2011-05-11 Thread Heikki Linnakangas

On 11.05.2011 08:29, Fujii Masao wrote:

On Sat, May 7, 2011 at 10:48 PM, Robert Haas robertmh...@gmail.com wrote:

I was able to reproduce something very like this in unpatched master,
just by letting recovery pause at a named restore point, and then
resuming it.


I was able to reproduce the same problem even in 9.0. When the standby
reaches the recovery target, it always tries to end the recovery even
though walreceiver is still running, which causes the problem. This seems
to be an oversight in streaming replication. I should have considered how
the standby should work when recovery_target is specified.

What about the attached patch? It stops walreceiver instead of
emitting a PANIC there, but only if we've reached the recovery target.


I think we can just always call ShutdownWalRcv(). It should be gone if 
the server was promoted while streaming, but that's just an 
implementation detail of what the promotion code does. There's no hard 
reason why it shouldn't be running at that point anymore, as long as we 
kill it before going any further.
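
A toy model of the change described here, assuming a simplified boolean in place of the real walreceiver state. The functions below are stand-ins written for this sketch, and the toy_ prefix marks them as hypothetical. Only the names ShutdownWalRcv() and WalRcvInProgress() come from the thread, and their real implementations differ.

```c
#include <stdbool.h>

/* Toy stand-in for the walreceiver state; not how the server tracks it. */
static bool toy_walrcv_running = false;

void
toy_StartWalRcv(void)
{
    toy_walrcv_running = true;
}

bool
toy_WalRcvInProgress(void)
{
    return toy_walrcv_running;
}

/* Idempotent: safe to call whether or not the receiver is running. */
void
toy_ShutdownWalRcv(void)
{
    toy_walrcv_running = false;
}

/*
 * Behaviour reported in the thread: a walreceiver still running at the end
 * of recovery is treated as fatal. Returns false where the server would
 * PANIC with "wal receiver still active".
 */
bool
toy_end_recovery_old(void)
{
    return !toy_WalRcvInProgress();
}

/* Behaviour after the fix: always shut the receiver down, then continue. */
bool
toy_end_recovery_new(void)
{
    toy_ShutdownWalRcv();
    return !toy_WalRcvInProgress();
}
```

Because the shutdown call is harmless when nothing is running, calling it unconditionally removes the need to reason about which code path left the receiver alive.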


Committed a patch to do that.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] time-delayed standbys

2011-05-11 Thread Fujii Masao
On Wed, May 11, 2011 at 6:50 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 I think we can just always call ShutdownWalRcv(). It should be gone if the
 server was promoted while streaming, but that's just an implementation
 detail of what the promotion code does. There's no hard reason why it
 shouldn't be running at that point anymore, as long as we kill it before
 going any further.

Okay. But I'd like to add the following assertion check just before
ShutdownWalRcv() which you added, in order to detect such a bug
that we found this time, i.e., the bug which causes unexpected end
of recovery. Thoughts?

Assert(reachedStopPoint || !WalRcvInProgress())

 Committed a patch to do that.

Thanks. Should we backport it to 9.0? 9.0 has the same problem.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] time-delayed standbys

2011-05-11 Thread Heikki Linnakangas

On 11.05.2011 14:16, Fujii Masao wrote:

On Wed, May 11, 2011 at 6:50 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

I think we can just always call ShutdownWalRcv(). It should be gone if the
server was promoted while streaming, but that's just an implementation
detail of what the promotion code does. There's no hard reason why it
shouldn't be running at that point anymore, as long as we kill it before
going any further.


Okay. But I'd like to add the following assertion check just before
ShutdownWalRcv() which you added, in order to detect such a bug
that we found this time, i.e., the bug which causes unexpected end
of recovery. Thoughts?

 Assert(reachedStopPoint || !WalRcvInProgress())


There's no unexpected end of recovery here. The recovery ends when we 
reach the target, as it should. It was the assumption that WAL receiver 
can't be running at that point anymore that was wrong.


That assertion would work, AFAICS, but I don't think it's something we 
need to assert. There isn't any harm done if WAL receiver is still 
running, as long as we shut it down at that point.



Committed a patch to do that.


Thanks. Should we backport it to 9.0? 9.0 has the same problem.


Ah, thanks, I missed that. Cherry-picked to 9.0 now as well.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] time-delayed standbys

2011-05-10 Thread Fujii Masao
On Sat, May 7, 2011 at 10:48 PM, Robert Haas robertmh...@gmail.com wrote:
 I was able to reproduce something very like this in unpatched master,
 just by letting recovery pause at a named restore point, and then
 resuming it.

I was able to reproduce the same problem even in 9.0. When the standby
reaches the recovery target, it always tries to end the recovery even
though walreceiver is still running, which causes the problem. This seems
to be an oversight in streaming replication. I should have considered how
the standby should work when recovery_target is specified.

What about the attached patch? It stops walreceiver instead of
emitting a PANIC there, but only if we've reached the recovery target.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


recovery_target_v1.patch
Description: Binary data



Re: [HACKERS] time-delayed standbys

2011-05-07 Thread Robert Haas
On Sat, Apr 23, 2011 at 9:46 PM, Jaime Casanova ja...@2ndquadrant.com wrote:
 On Tue, Apr 19, 2011 at 9:47 PM, Robert Haas robertmh...@gmail.com wrote:

 That is, a standby configured such that replay lags a prescribed
 amount of time behind the master.

 This seemed easy to implement, so I did.  Patch (for 9.2, obviously) attached.


 This crashes when stopping recovery at a target (I tried with a named
 restore point and with a point in time) after executing
 pg_xlog_replay_resume(). Here is the backtrace. I will try to check
 later but I wanted to report it before...

 #0  0xb537 in raise () from /lib/libc.so.6
 #1  0xb777a922 in abort () from /lib/libc.so.6
 #2  0x08393a19 in errfinish (dummy=0) at elog.c:513
 #3  0x083944ba in elog_finish (elevel=22, fmt=0x83d5221 "wal receiver still active") at elog.c:1156
 #4  0x080f04cb in StartupXLOG () at xlog.c:6691
 #5  0x080f2825 in StartupProcessMain () at xlog.c:10050
 #6  0x0811468f in AuxiliaryProcessMain (argc=2, argv=0xbfa326a8) at bootstrap.c:417
 #7  0x0827c2ea in StartChildProcess (type=StartupProcess) at postmaster.c:4488
 #8  0x08280b85 in PostmasterMain (argc=3, argv=0xa4c17e8) at postmaster.c:1106
 #9  0x0821730f in main (argc=3, argv=0xa4c17e8) at main.c:199

Sorry for the slow response on this - I was on vacation for a week and
my schedule got a big hole in it.

I was able to reproduce something very like this in unpatched master,
just by letting recovery pause at a named restore point, and then
resuming it.

LOG:  recovery stopping at restore point "stop", time 2011-05-07 09:28:01.652958-04
LOG:  recovery has paused
HINT:  Execute pg_xlog_replay_resume() to continue.
(at this point I did pg_xlog_replay_resume())
LOG:  redo done at 0/520
PANIC:  wal receiver still active
LOG:  startup process (PID 38762) was terminated by signal 6: Abort trap
LOG:  terminating any other active server processes

I'm thinking that this code is wrong:

if (recoveryPauseAtTarget && standbyState == STANDBY_SNAPSHOT_READY)
{
    SetRecoveryPause(true);
    recoveryPausesHere();
}
reachedStopPoint = true;    /* see below */
recoveryContinue = false;

I think that recoveryContinue = false assignment should not happen if
we decide to pause.  That is, we should say if (recoveryPauseAtTarget
&& standbyState == STANDBY_SNAPSHOT_READY) { same as now } else
recoveryContinue = false.

I haven't tested that, though.
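
Spelled out as a self-contained C sketch, the control-flow change suggested above would look roughly like this. The function and parameter names are invented for the sketch, the pause itself is not modelled, and none of this has been run against the real xlog.c.

```c
#include <stdbool.h>

/*
 * Logic as quoted above: recovery stops at the target whether or not we
 * paused there. The return value models recoveryContinue.
 */
bool
recovery_continue_current(bool pause_at_target, bool snapshot_ready)
{
    if (pause_at_target && snapshot_ready)
    {
        /* real code: SetRecoveryPause(true); recoveryPausesHere(); */
    }
    return false;
}

/*
 * Suggested variant: only stop if we did not pause, so that resuming
 * continues replay instead of ending recovery.
 */
bool
recovery_continue_proposed(bool pause_at_target, bool snapshot_ready)
{
    if (pause_at_target && snapshot_ready)
    {
        /* pause here; after resume, keep replaying */
        return true;
    }
    return false;
}
```

The only case where the two differ is when both conditions hold, which is exactly the pause-then-resume path that produced the PANIC.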

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] time-delayed standbys

2011-04-23 Thread Jaime Casanova
On Tue, Apr 19, 2011 at 9:47 PM, Robert Haas robertmh...@gmail.com wrote:

 That is, a standby configured such that replay lags a prescribed
 amount of time behind the master.

 This seemed easy to implement, so I did.  Patch (for 9.2, obviously) attached.


This crashes when stopping recovery at a target (I tried with a named
restore point and with a point in time) after executing
pg_xlog_replay_resume(). Here is the backtrace. I will try to check
later but I wanted to report it before...

#0  0xb537 in raise () from /lib/libc.so.6
#1  0xb777a922 in abort () from /lib/libc.so.6
#2  0x08393a19 in errfinish (dummy=0) at elog.c:513
#3  0x083944ba in elog_finish (elevel=22, fmt=0x83d5221 "wal receiver still active") at elog.c:1156
#4  0x080f04cb in StartupXLOG () at xlog.c:6691
#5  0x080f2825 in StartupProcessMain () at xlog.c:10050
#6  0x0811468f in AuxiliaryProcessMain (argc=2, argv=0xbfa326a8) at bootstrap.c:417
#7  0x0827c2ea in StartChildProcess (type=StartupProcess) at postmaster.c:4488
#8  0x08280b85 in PostmasterMain (argc=3, argv=0xa4c17e8) at postmaster.c:1106
#9  0x0821730f in main (argc=3, argv=0xa4c17e8) at main.c:199


-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte y capacitación de PostgreSQL


