Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-12-01 Thread Michael Paquier
On Wed, Dec 2, 2015 at 12:57 AM, Teodor Sigaev  wrote:
> Thank you, committed.

Youhou, thanks!
I still think that we should have a test case for this patch close to
the script I sent on this thread. I'll get into it once the
infrastructure patch for recovery regression tests gets in.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-12-01 Thread Teodor Sigaev

Thank you, committed.

Alexander Korotkov wrote:

On Sat, Nov 28, 2015 at 2:27 PM, Michael Paquier > wrote:

On Fri, Nov 27, 2015 at 11:42 PM, Teodor Sigaev > wrote:
> Seems, patch is ready to commit. But it needs some documentation.

Of what kind? The documentation of pg_rewind is rather explicit on the
subject and looks fine as-is, and that's what Alexander and I agreed
on upthread. If there is something forgotten though, this may be the
fact that we do not mention in 9.5's documentation that pg_rewind can
*not* handle timeline switches.


​However, we can add some explicit statements​ about new pg_rewind capabilities.
Please, check attached patch for those statements.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com 
The Russian Postgres Company





--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-12-01 Thread Alexander Korotkov
On Sat, Nov 28, 2015 at 2:27 PM, Michael Paquier 
wrote:

> On Fri, Nov 27, 2015 at 11:42 PM, Teodor Sigaev  wrote:
> > Seems, patch is ready to commit. But it needs some documentation.
>
> Of what kind? The documentation of pg_rewind is rather explicit on the
> subject and looks fine as-is, and that's what Alexander and I agreed
> on upthread. If there is something forgotten though, this may be the
> fact that we do not mention in 9.5's documentation that pg_rewind can
> *not* handle timeline switches.
>

​However, we can add some explicit statements​ about new pg_rewind
capabilities. Please, check attached patch for those statements.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


pg-rewind-target-switch-7.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-11-28 Thread Michael Paquier
On Fri, Nov 27, 2015 at 11:42 PM, Teodor Sigaev  wrote:
> Seems, patch is ready to commit. But it needs some documentation.

Of what kind? The documentation of pg_rewind is rather explicit on the
subject and looks fine as-is, and that's what Alexander and I agreed
on upthread. If there is something forgotten though, this may be the
fact that we do not mention in 9.5's documentation that pg_rewind can
*not* handle timeline switches.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-11-27 Thread Teodor Sigaev

Seems, patch is ready to commit. But it needs some documentation.

Alexander Korotkov wrote:

On Wed, Sep 30, 2015 at 9:46 AM, Michael Paquier > wrote:

On Sat, Sep 19, 2015 at 8:27 AM, Michael Paquier wrote:
> On Fri, Sep 18, 2015 at 6:25 PM, Michael Paquier wrote:
>> The refactoring of getTimelineHistory as you propose looks like a good
>> idea to me, I tried to remove by myself the difference between source
>> and target in copy_fetch.c and friends but this gets uglier,
>> particularly because of datadir_source in copy_file_range. Not worth
>> it.
>
> Forgot that:
> if (ControlFile_target.state != DB_SHUTDOWNED)
> pg_fatal("target server must be shut down cleanly\n");
> We may want to allow a target node shutdowned in recovery as well here.

So, attached is a more polished version of this patch, cleaned up of
its typos with as well other things. I have noticed for example that
it would be more useful to add the debug information of a timeline
file fetched from the source or a target server directly in
getTimelineHistory. I have as well updated a couple of comments in the
code regarding the fact that we do not necessarily use a master as a
target node, and mentioned in findCommonAncestorTimeline that we check
as well the start position of a timeline to cover the case where both
target and source node forked at the same timeline number but with a
different WAL fork position.
I am marking this patch as ready for committer. It would be cool in
the future to use the recovery test suite to have more advanced
scenarios tested, but it seems a shame to block this patch because of
that.


Thanks a lot. Now patch looks much better.
I found that it doesn't applies cleanly on the current master. Rebased version
is attached.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com 
The Russian Postgres Company





--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-10-14 Thread Alexander Korotkov
On Wed, Sep 30, 2015 at 9:46 AM, Michael Paquier 
wrote:

> On Sat, Sep 19, 2015 at 8:27 AM, Michael Paquier wrote:
> > On Fri, Sep 18, 2015 at 6:25 PM, Michael Paquier wrote:
> >> The refactoring of getTimelineHistory as you propose looks like a good
> >> idea to me, I tried to remove by myself the difference between source
> >> and target in copy_fetch.c and friends but this gets uglier,
> >> particularly because of datadir_source in copy_file_range. Not worth
> >> it.
> >
> > Forgot that:
> > if (ControlFile_target.state != DB_SHUTDOWNED)
> > pg_fatal("target server must be shut down cleanly\n");
> > We may want to allow a target node shutdowned in recovery as well here.
>
> So, attached is a more polished version of this patch, cleaned up of
> its typos with as well other things. I have noticed for example that
> it would be more useful to add the debug information of a timeline
> file fetched from the source or a target server directly in
> getTimelineHistory. I have as well updated a couple of comments in the
> code regarding the fact that we do not necessarily use a master as a
> target node, and mentioned in findCommonAncestorTimeline that we check
> as well the start position of a timeline to cover the case where both
> target and source node forked at the same timeline number but with a
> different WAL fork position.
> I am marking this patch as ready for committer. It would be cool in
> the future to use the recovery test suite to have more advanced
> scenarios tested, but it seems a shame to block this patch because of
> that.
>

Thanks a lot. Now patch looks much better.
I found that it doesn't applies cleanly on the current master. Rebased
version is attached.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


pg-rewind-target-switch-6.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-10-14 Thread Alexander Korotkov
On Sat, Sep 19, 2015 at 2:25 AM, Michael Paquier 
wrote:

> OK, I see your point and you are right. This additional check allows
> pg_rewind to switch one timeline back and make the scan of blocks
> begin at the real origin of both timelines. I had in mind the case
> where you tried to actually rewind node 2 to node 3 actually which
> would not be possible with your patch, and by thinking more about that
> I think that it could be possible to remove the check I am listing
> below and rely on the checks in the history files, basically what is
> in findCommonAncestorTimeline:
> if (ControlFile_target.checkPointCopy.ThisTimeLineID ==
> ControlFile_source.checkPointCopy.ThisTimeLineID)
> pg_fatal("source and target cluster are on the same
> timeline\n");
> Alexander, what do you think about that? I think that we should be
> able to rewind with for example node 2 as target and node 3 as source,
> and vice-versa as per the example you gave even if both nodes are on
> the same timeline, just that they forked at a different point. Could
> you test this particular case? Using my script, simply be sure to
> switch archive_mode to on/off depending on the node, aka only 3 and 4
> do archiving.
>

I think relying on different fork point is not safe enough. Imagine more
complex case.

  1
 / \
2   3
|   |
4   5

At first, nodes 2 and 3 are promoted at the same point and they both get
timeline 2.
Then nodes 4 and 5 are promoted at different points and they both get
timeline 3.
Then we can try to rewind node 4 with node 5 as the source or vice versa.
In this case we can't find collision of timeline 2.

The same collision could happen even when source and target are on the
different timeline number. However, having the on the same timeline numbers
is suspicious enough to disallow it until we have additional checks.

I could propose following plan:

   1. Commit this patch without allowing rewind when target and source are
   on the same timelines.
   2. Make additional checks for distinguish different timelines with the
   same numbers.
   3. Allow rewind when target and source are on the same timelines.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-10-14 Thread Michael Paquier
On Wed, Oct 14, 2015 at 6:00 PM, Alexander Korotkov wrote:
> On Sat, Sep 19, 2015 at 2:25 AM, Michael Paquier wrote:
>> Alexander, what do you think about that? I think that we should be
>> able to rewind with for example node 2 as target and node 3 as source,
>> and vice-versa as per the example you gave even if both nodes are on
>> the same timeline, just that they forked at a different point. Could
>> you test this particular case? Using my script, simply be sure to
>> switch archive_mode to on/off depending on the node, aka only 3 and 4
>> do archiving.
>
> I think relying on different fork point is not safe enough. Imagine more
> complex case.
>
>   1
>  / \
> 2   3
> |   |
> 4   5
>
> At first, nodes 2 and 3 are promoted at the same point and they both get
> timeline 2.
> Then nodes 4 and 5 are promoted at different points and they both get
> timeline 3.

You mean that those nodes get the history file generated at timeline 2
by either 2 or 3 here I guess.

> Then we can try to rewind node 4 with node 5 as the source or vice versa. In
> this case we can't find collision of timeline 2.
> The same collision could happen even when source and target are on the
> different timeline number. However, having the on the same timeline numbers
> is suspicious enough to disallow it until we have additional checks.

Check.

> I could propose following plan:
> Commit this patch without allowing rewind when target and source are on the
> same timelines.
> Make additional checks for distinguish different timelines with the same
> numbers.

That's the addition of the 8-byte timeline ID in the XLOG segment
header. We would need to store it in the control data file as well.

> Allow rewind when target and source are on the same timelines.

Check. This relies on the fact that a portion of the nodes select
their new timeline without knowing the previous history. I believe
that this does not exist much in a perfect world, but I've seen enough
setups messed up to conclude that this would surely exist on the
field. In short this plan looks fine to me. Your patch is very
valuable even with this same-timeline restriction in place.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-09-30 Thread Michael Paquier
On Sat, Sep 19, 2015 at 8:27 AM, Michael Paquier wrote:
> On Fri, Sep 18, 2015 at 6:25 PM, Michael Paquier wrote:
>> The refactoring of getTimelineHistory as you propose looks like a good
>> idea to me, I tried to remove by myself the difference between source
>> and target in copy_fetch.c and friends but this gets uglier,
>> particularly because of datadir_source in copy_file_range. Not worth
>> it.
>
> Forgot that:
> if (ControlFile_target.state != DB_SHUTDOWNED)
> pg_fatal("target server must be shut down cleanly\n");
> We may want to allow a target node shutdowned in recovery as well here.

So, attached is a more polished version of this patch, cleaned up of
its typos with as well other things. I have noticed for example that
it would be more useful to add the debug information of a timeline
file fetched from the source or a target server directly in
getTimelineHistory. I have as well updated a couple of comments in the
code regarding the fact that we do not necessarily use a master as a
target node, and mentioned in findCommonAncestorTimeline that we check
as well the start position of a timeline to cover the case where both
target and source node forked at the same timeline number but with a
different WAL fork position.
I am marking this patch as ready for committer. It would be cool in
the future to use the recovery test suite to have more advanced
scenarios tested, but it seems a shame to block this patch because of
that.
Regards,
-- 
Michael
From 563f49ac3b07a49e89844112887eee2d4d982879 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Wed, 30 Sep 2015 15:39:34 +0900
Subject: [PATCH] Handle timeline switches in pg_rewind

This feature allows pg_rewind to handle data directory synchronization
is much more general way, one example being to be able to return a
promoted standby to its old master. This patch softens the shutdown
condition of a target and source node (when this one's data directory
is used as a synchronization point) to allow the use of nodes that
have been shutdown while in recovery.

Patch by Alexander Korotkov.
---
 src/bin/pg_rewind/Makefile|   2 +-
 src/bin/pg_rewind/parsexlog.c |  39 ++---
 src/bin/pg_rewind/pg_rewind.c | 189 ++
 src/bin/pg_rewind/pg_rewind.h |  10 ++-
 4 files changed, 170 insertions(+), 70 deletions(-)

diff --git a/src/bin/pg_rewind/Makefile b/src/bin/pg_rewind/Makefile
index 92b5d20..48dc770 100644
--- a/src/bin/pg_rewind/Makefile
+++ b/src/bin/pg_rewind/Makefile
@@ -8,7 +8,7 @@
 #
 #-
 
-PGFILEDESC = "pg_rewind - repurpose an old master server as standby"
+PGFILEDESC = "pg_rewind - synchronize a data directory with another one forked from"
 PGAPPICON = win32
 
 subdir = src/bin/pg_rewind
diff --git a/src/bin/pg_rewind/parsexlog.c b/src/bin/pg_rewind/parsexlog.c
index 2081cf8..d69eafb 100644
--- a/src/bin/pg_rewind/parsexlog.c
+++ b/src/bin/pg_rewind/parsexlog.c
@@ -45,7 +45,7 @@ static char xlogfpath[MAXPGPATH];
 typedef struct XLogPageReadPrivate
 {
 	const char *datadir;
-	TimeLineID	tli;
+	int			tliIndex;
 } XLogPageReadPrivate;
 
 static int SimpleXLogPageRead(XLogReaderState *xlogreader,
@@ -55,11 +55,11 @@ static int SimpleXLogPageRead(XLogReaderState *xlogreader,
 
 /*
  * Read WAL from the datadir/pg_xlog, starting from 'startpoint' on timeline
- * 'tli', until 'endpoint'. Make note of the data blocks touched by the WAL
- * records, and return them in a page map.
+ * index 'tliIndex' in target timeline history, until 'endpoint'. Make note of
+ * the data blocks touched by the WAL records, and return them in a page map.
  */
 void
-extractPageMap(const char *datadir, XLogRecPtr startpoint, TimeLineID tli,
+extractPageMap(const char *datadir, XLogRecPtr startpoint, int tliIndex,
 			   XLogRecPtr endpoint)
 {
 	XLogRecord *record;
@@ -68,7 +68,7 @@ extractPageMap(const char *datadir, XLogRecPtr startpoint, TimeLineID tli,
 	XLogPageReadPrivate private;
 
 	private.datadir = datadir;
-	private.tli = tli;
+	private.tliIndex = tliIndex;
 	xlogreader = XLogReaderAllocate(, );
 	if (xlogreader == NULL)
 		pg_fatal("out of memory\n");
@@ -112,7 +112,7 @@ extractPageMap(const char *datadir, XLogRecPtr startpoint, TimeLineID tli,
  * doing anything with the record itself.
  */
 XLogRecPtr
-readOneRecord(const char *datadir, XLogRecPtr ptr, TimeLineID tli)
+readOneRecord(const char *datadir, XLogRecPtr ptr, int tliIndex)
 {
 	XLogRecord *record;
 	XLogReaderState *xlogreader;
@@ -121,7 +121,7 @@ readOneRecord(const char *datadir, XLogRecPtr ptr, TimeLineID tli)
 	XLogRecPtr	endptr;
 
 	private.datadir = datadir;
-	private.tli = tli;
+	private.tliIndex = tliIndex;
 	xlogreader = XLogReaderAllocate(, );
 	if (xlogreader == NULL)
 		pg_fatal("out of memory\n");
@@ -152,7 +152,7 @@ readOneRecord(const char *datadir, XLogRecPtr ptr, TimeLineID tli)
  * Find the previous checkpoint preceding given WAL position.
  

Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-09-18 Thread Michael Paquier
On Fri, Sep 18, 2015 at 12:34 PM, Michael Paquier
 wrote:
> On Fri, Sep 18, 2015 at 9:00 AM, Alexander Korotkov wrote:
>> BTW, it would be an option to generate system_identifier to each new
>> timeline, by analogy of initdb do for the whole WAL.
>> Having such system_identifiers we can distinguish different timeline which
>> have assigned same ids.
>> Any thoughts?
>
> If you mean a new field incorporated in XLogLongPageHeader and
> ControlFile to ensure that a new timeline generated is unique across
> the same installation identified with system_identifier, then I'm not
> against it for something up to 2 bytes (even 1 byte would be fine),

Er, 2 bytes may be a bad idea as well, 1/16k% chance of collision
looks dangerous when rewinding a node... It could cause silent data
corruption on the standby being rewounded.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-09-18 Thread Michael Paquier
On Fri, Sep 18, 2015 at 6:25 PM, Michael Paquier wrote:
> The refactoring of getTimelineHistory as you propose looks like a good
> idea to me, I tried to remove by myself the difference between source
> and target in copy_fetch.c and friends but this gets uglier,
> particularly because of datadir_source in copy_file_range. Not worth
> it.

Forgot that:
if (ControlFile_target.state != DB_SHUTDOWNED)
pg_fatal("target server must be shut down cleanly\n");
We may want to allow a target node shutdowned in recovery as well here.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-09-18 Thread Michael Paquier
On Wed, Sep 16, 2015 at 9:47 AM, Alexander Korotkov wrote:
> On Thu, Sep 10, 2015 at 8:33 AM, Michael Paquier wrote:
> OK, I get it. I tried to get rid of code duplication in the attached patch.
> There is getTimelineHistory() function which gets control file as argument
> and fetches history from appropriate path.

Cool, thanks!

> Imagine following configuration of server.
>   1
>  / \
> 2   3
> |
> 4
>
> Initially, node 1 is master, nodes 2 and 3 are standbys for node 1. node 4
> is cascaded standby for node 3.
> Then node 2 and node 3 are promoted. They both got timeline number 2. Then
> node 3 is promoted and gets timeline number 3.

Typo. You mean node 4 getting timeline 3 here.

> Then we could try to rewind node 4 with node 2 as source. How pg_rewind
> could know that timeline number 2 for those nodes is not the same?
> We can only check that those timelines are forked from timeline 1 at the
> same place. But coincidence is still possible.

OK, I see your point and you are right. This additional check allows
pg_rewind to switch one timeline back and make the scan of blocks
begin at the real origin of both timelines. I had in mind the case
where you tried to actually rewind node 2 to node 3 actually which
would not be possible with your patch, and by thinking more about that
I think that it could be possible to remove the check I am listing
below and rely on the checks in the history files, basically what is
in findCommonAncestorTimeline:
if (ControlFile_target.checkPointCopy.ThisTimeLineID ==
ControlFile_source.checkPointCopy.ThisTimeLineID)
pg_fatal("source and target cluster are on the same
timeline\n");
Alexander, what do you think about that? I think that we should be
able to rewind with for example node 2 as target and node 3 as source,
and vice-versa as per the example you gave even if both nodes are on
the same timeline, just that they forked at a different point. Could
you test this particular case? Using my script, simply be sure to
switch archive_mode to on/off depending on the node, aka only 3 and 4
do archiving.

+   /*
+* Since incomplete segments are copied into next
timelines, find the
+* lastest timeline holding required segment. Assuming
we can move
+* in xlog both forward and backward, consider also
switching timeline
+* back.
+*/
s/lastest/correct? "lastest" does sound very English to me even if
that's grammatically correct.

+ * Find minimum from two xlog pointers assuming invalid pointer (0) means
+ * infinity as timeline.h states.
This needs an update, now timeline.h points out correctly that this is
not 0, but InvalidXLogRecPtr.

+   /* Retreive timelines for both source and target */
s/Retreive/Retrieve.

The refactoring of getTimelineHistory as you propose looks like a good
idea to me, I tried to remove by myself the difference between source
and target in copy_fetch.c and friends but this gets uglier,
particularly because of datadir_source in copy_file_range. Not worth
it.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-09-18 Thread Alexander Korotkov
On Wed, Sep 16, 2015 at 7:47 PM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> On Thu, Sep 10, 2015 at 8:33 AM, Michael Paquier <
> michael.paqu...@gmail.com> wrote:
>
>> On Wed, Sep 9, 2015 at 7:13 PM, Alexander Korotkov wrote:
>
> > A also added additional check in findCommonAncestorTimeline(). Two
>> standbys
>> > could be independently promoted and get the same new timeline ID. Now,
>> it's
>> > checked that timelines, that we assume to be the same, have equal
>> begins.
>> > Begins could still match by coincidence. But the same risk exists in
>> > original pg_rewind, though.
>>
>> Really? pg_rewind blocks attempts to rewind two nodes that are already
>> on the same timeline before checking if their timeline history map at
>> some point or not:
>> /*
>>  * If both clusters are already on the same timeline, there's
>> nothing to
>>  * do.
>>  */
>> if (ControlFile_target.checkPointCopy.ThisTimeLineID ==
>> ControlFile_source.checkPointCopy.ThisTimeLineID)
>> pg_fatal("source and target cluster are on the same
>> timeline\n");
>> And this seems really justified to me. Imagine that you have one
>> master, with two standbys linked to it. If both standbys are promoted
>> to the same timeline, you could easily replug them to the master, but
>> I fail to see the point to be able to replug one promoted standby with
>> the other in this case: those nodes have segment and history files
>> that map with each other, an operator could easily mess up things in
>> such a configuration.
>>
>
> Imagine following configuration of server.
>   1
>  / \
> 2   3
> |
> 4
>
> Initially, node 1 is master, nodes 2 and 3 are standbys for node 1. node 4
> is cascaded standby for node 3.
> Then node 2 and node 3 are promoted. They both got timeline number 2. Then
> node 3 is promoted and gets timeline number 3.
> Then we could try to rewind node 4 with node 2 as source. How pg_rewind
> could know that timeline number 2 for those nodes is not the same?
> We can only check that those timelines are forked from timeline 1 at the
> same place. But coincidence is still possible.
>

BTW, it would be an option to generate system_identifier to each new
timeline, by analogy of initdb do for the whole WAL.
Having such system_identifiers we can distinguish different timeline which
have assigned same ids.
Any thoughts?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-09-18 Thread Michael Paquier
On Fri, Sep 18, 2015 at 9:00 AM, Alexander Korotkov wrote:
> BTW, it would be an option to generate system_identifier to each new
> timeline, by analogy of initdb do for the whole WAL.
> Having such system_identifiers we can distinguish different timeline which
> have assigned same ids.
> Any thoughts?

If you mean a new field incorporated in XLogLongPageHeader and
ControlFile to ensure that a new timeline generated is unique across
the same installation identified with system_identifier, then I'm not
against it for something up to 2 bytes (even 1 byte would be fine),
though it seems that there is a very narrow need for it. Do you have
cases in mind that could use it? Even in the case of pg_rewind, it
seems to me we would not need this extra timeline-based ID. Per your
case above, it is possible to rewind node 4 on timeline 3 using node 2
on timeline 2 when both timelines have forked exactly at the same
point from timeline 1, by having node 4 beginning to replay from
timeline 1 (last checkpoint record before WAL forked), and if I am
reading your patch correctly that's what you do.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-09-16 Thread Alexander Korotkov
On Thu, Sep 10, 2015 at 8:33 AM, Michael Paquier 
wrote:

> On Wed, Sep 9, 2015 at 7:13 PM, Alexander Korotkov wrote:
> > On Wed, Sep 9, 2015 at 9:01 AM, Michael Paquier wrote:
> >> The code building the target history file is a duplicate of what is done
> >> with the source. Perhaps we could refactor that as a single routine in
> >> pg_rewind.c?
> >
> >
> > Do you mean duplication between rewind_parseTimeLineHistory() and
> > readTimeLineHistory(). We could put readTimeLineHistory() into common
> with
> > some refactoring. This is the subject of separate patch, though.
>
> Well, there is that :) But I was referring to this part (beware of
> useless newlines in your code btw):
> +   if (targettli == 1)
> {
> -   TimeLineHistoryEntry *entry = [i];
> +   targetHistory = (TimeLineHistoryEntry *)
> pg_malloc(sizeof(TimeLineHistoryEntry));
> +   targetHistory->tli = targettli;
> +   targetHistory->begin = targetHistory->end =
> InvalidXLogRecPtr;
> +   targetNentries = 1;
> +
> +   }
> Your patch generates a timeline history array for the target node, and
> HEAD does exactly the same for the source node, so IMO your patch is
> duplicating code, the only difference being that fetchFile is used for
> the source history file and slurpFile is used for the target history
> file. Up to now the code duplication caused by the difference that the
> target is always a local instance, and the source may be either local
> or remote has created the interface that we have now, but I think that
> we should refactor fetch.c such as the routines it contains do not
> rely anymore on datadir_source, and use instead a datadir argument. If
> datadir is NULL, the remote code path is used. If it is not NULL,
> local code path is used. This way we can make the target node use
> fetchFile only, and remove the duplication your patch is adding, as
> well as future ones like that. Thoughts?
>

OK, I get it. I tried to get rid of code duplication in the attached patch.
There is getTimelineHistory() function which gets control file as argument
and fetches history from appropriate path.


> >> Except that, the patch looks pretty neat to me. I was wondering as well:
> >> what tests did you run up to now with this patch? I am attaching an
> updated
> >> version of my test script I used for some more complex scenarios. Feel
> free
> >> to play with it.
> >
> > I was running manual tests like a noob :) When you attached you bash
> script,
> > I've switched to it.
>
> [product_placement]The patch to improve the recovery test suite
> submitted in this CF would have eased the need of bash-ing test cases
> here, and we could have test cases directly included in your
> patch.[/product_placement]
>
> > A also added additional check in findCommonAncestorTimeline(). Two
> standbys
> > could be independently promoted and get the same new timeline ID. Now,
> it's
> > checked that timelines, that we assume to be the same, have equal begins.
> > Begins could still match by coincidence. But the same risk exists in
> > original pg_rewind, though.
>
> Really? pg_rewind blocks attempts to rewind two nodes that are already
> on the same timeline before checking if their timeline history map at
> some point or not:
> /*
>  * If both clusters are already on the same timeline, there's
> nothing to
>  * do.
>  */
> if (ControlFile_target.checkPointCopy.ThisTimeLineID ==
> ControlFile_source.checkPointCopy.ThisTimeLineID)
> pg_fatal("source and target cluster are on the same
> timeline\n");
> And this seems really justified to me. Imagine that you have one
> master, with two standbys linked to it. If both standbys are promoted
> to the same timeline, you could easily replug them to the master, but
> I fail to see the point to be able to replug one promoted standby with
> the other in this case: those nodes have segment and history files
> that map with each other, an operator could easily mess up things in
> such a configuration.
>

Imagine following configuration of server.
  1
 / \
2   3
|
4

Initially, node 1 is master, nodes 2 and 3 are standbys for node 1. node 4
is cascaded standby for node 3.
Then node 2 and node 3 are promoted. They both got timeline number 2. Then
node 3 is promoted and gets timeline number 3.
Then we could try to rewind node 4 with node 2 as source. How pg_rewind
could know that timeline number 2 for those nodes is not the same?
We can only check that those timelines are forked from timeline 1 at the
same place. But coincidence is still possible.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


pg-rewind-target-switch-5.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-09-10 Thread Alvaro Herrera
Michael Paquier wrote:
> On Wed, Sep 9, 2015 at 7:13 PM, Alexander Korotkov wrote:

> > I was running manual tests like a noob :) When you attached you bash script,
> > I've switched to it.
> 
> [product_placement]The patch to improve the recovery test suite
> submitted in this CF would have eased the need of bash-ing test cases
> here, and we could have test cases directly included in your
> patch.[/product_placement]

The problem of bash test scripts is that they don't run on Windows,
which is why we settled on Perl-based TAP tests.  I think having tests
for various aspects of recovery is useful, because being relatively new
it's still being hacked in various ways and it's good to ensure that
things that were previously working continue to work.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-09-09 Thread Michael Paquier
On Wed, Sep 9, 2015 at 3:27 AM, Alexander Korotkov
 wrote:

> On Tue, Sep 8, 2015 at 10:28 AM, Michael Paquier
> wrote:
>
>> I am planning to do as well a detailed code review rather soon.
>>
>
> Good, thanks.
>

When testing a bit more complex structures, it occurred to me that we may
want as well to use as a source node a standby. For example based on the
case of my cluster above:
 master (5432)
  /  \
 1 (5433)   2 (5434)
 |
 3 (5435)
If I try for example to rebuild the cluster as follows there will be
failures:
1) Rewind with source = 3, target = 2
2) Start 3 and 2
3) Shutdown 2
3) Rewind source = 2, target = 1, failure with:
source data directory must be shut down cleanly

It seems to me that we should allow a node that has been shutdowned in
recovery to be used as a source for rewind as well, as follows:
-   if (datadir_source && ControlFile_source.state != DB_SHUTDOWNED)
+   if (datadir_source &&
+   ControlFile_source.state != DB_SHUTDOWNED &&
+   ControlFile_source.state != DB_SHUTDOWNED_IN_RECOVERY)
pg_fatal("source data directory must be shut down
cleanly\n");
At least your patch justifies in my eyes such a change.

 /*
+ * Find minimum from two xlog pointers assuming invalid pointer is greatest
+ * possible pointer.
+ */
+static XLogRecPtr
+xlPtrMin(XLogRecPtr a, XLogRecPtr b)
+{
+   if (XLogRecPtrIsInvalid(a))
+   return b;
+   else if (XLogRecPtrIsInvalid(b))
+   return a;
+   else
+   return Min(a, b);
+}
This is true as timeline.h tells so, still I think that it would be better
to mention that this is this assumption is held in this header file, or
simply that timeline history entries at the top have their end position set
as InvalidXLogRecPtr which is a synonym of infinity.

The code building the target history file is a duplicate of what is done
with the source. Perhaps we could refactor that as a single routine in
pg_rewind.c?

Except that, the patch looks pretty neat to me. I was wondering as well:
what tests did you run up to now with this patch? I am attaching an updated
version of my test script I used for some more complex scenarios. Feel free
to play with it.
-- 
Michael


rewind_test.bash
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-09-09 Thread Alexander Korotkov
On Wed, Sep 9, 2015 at 9:01 AM, Michael Paquier 
wrote:

>
>
> On Wed, Sep 9, 2015 at 3:27 AM, Alexander Korotkov
>  wrote:
>
>> On Tue, Sep 8, 2015 at 10:28 AM, Michael Paquier
>> wrote:
>>
>>> I am planning to do as well a detailed code review rather soon.
>>>
>>
>> Good, thanks.
>>
>
> When testing a bit more complex structures, it occurred to me that we may
> want as well to use as a source node a standby. For example based on the
> case of my cluster above:
>  master (5432)
>   /  \
>  1 (5433)   2 (5434)
>  |
>  3 (5435)
> If I try for example to rebuild the cluster as follows there will be
> failures:
> 1) Rewind with source = 3, target = 2
> 2) Start 3 and 2
> 3) Shutdown 2
> 3) Rewind source = 2, target = 1, failure with:
> source data directory must be shut down cleanly
>
> It seems to me that we should allow a node that has been shutdowned in
> recovery to be used as a source for rewind as well, as follows:
> -   if (datadir_source && ControlFile_source.state != DB_SHUTDOWNED)
> +   if (datadir_source &&
> +   ControlFile_source.state != DB_SHUTDOWNED &&
> +   ControlFile_source.state != DB_SHUTDOWNED_IN_RECOVERY)
> pg_fatal("source data directory must be shut down
> cleanly\n");
> At least your patch justifies in my eyes such a change.
>

It's source is not required to be √ in recovery if we specify it by
connection string. This is why I think DB_SHUTDOWNED_IN_RECOVERY is OK when
we specify source by data directory. Included in attached patch.

 /*
> + * Find minimum from two xlog pointers assuming invalid pointer is
> greatest
> + * possible pointer.
> + */
> +static XLogRecPtr
> +xlPtrMin(XLogRecPtr a, XLogRecPtr b)
> +{
> +   if (XLogRecPtrIsInvalid(a))
> +   return b;
> +   else if (XLogRecPtrIsInvalid(b))
> +   return a;
> +   else
> +   return Min(a, b);
> +}
> This is true as timeline.h tells so, still I think that it would be better
> to mention that this is this assumption is held in this header file, or
> simply that timeline history entries at the top have their end position set
> as InvalidXLogRecPtr which is a synonym of infinity.
>

Agree. Comment is adjusted.


> The code building the target history file is a duplicate of what is done
> with the source. Perhaps we could refactor that as a single routine in
> pg_rewind.c?
>

Do you mean duplication between rewind_parseTimeLineHistory()
and readTimeLineHistory(). We could put readTimeLineHistory() into common
with some refactoring. This is the subject of separate patch, though.

Except that, the patch looks pretty neat to me. I was wondering as well:
> what tests did you run up to now with this patch? I am attaching an updated
> version of my test script I used for some more complex scenarios. Feel free
> to play with it.
>

I was running manual tests like a noob :) When you attached you bash
script, I've switched to it.

A also added additional check in findCommonAncestorTimeline(). Two standbys
could be independently promoted and get the same new timeline ID. Now, it's
checked that timelines, that we assume to be the same, have equal begins.
Begins could still match by coincidence. But the same risk exists in
original pg_rewind, though.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


pg-rewind-target-switch-4.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-09-09 Thread Michael Paquier
On Wed, Sep 9, 2015 at 7:13 PM, Alexander Korotkov wrote:
> On Wed, Sep 9, 2015 at 9:01 AM, Michael Paquier wrote:
>> The code building the target history file is a duplicate of what is done
>> with the source. Perhaps we could refactor that as a single routine in
>> pg_rewind.c?
>
>
> Do you mean duplication between rewind_parseTimeLineHistory() and
> readTimeLineHistory(). We could put readTimeLineHistory() into common with
> some refactoring. This is the subject of separate patch, though.

Well, there is that :) But I was referring to this part (beware of
useless newlines in your code btw):
+   if (targettli == 1)
{
-   TimeLineHistoryEntry *entry = [i];
+   targetHistory = (TimeLineHistoryEntry *)
pg_malloc(sizeof(TimeLineHistoryEntry));
+   targetHistory->tli = targettli;
+   targetHistory->begin = targetHistory->end = InvalidXLogRecPtr;
+   targetNentries = 1;
+
+   }
Your patch generates a timeline history array for the target node, and
HEAD does exactly the same for the source node, so IMO your patch is
duplicating code, the only difference being that fetchFile is used for
the source history file and slurpFile is used for the target history
file. Up to now the code duplication caused by the difference that the
target is always a local instance, and the source may be either local
or remote has created the interface that we have now, but I think that
we should refactor fetch.c such as the routines it contains do not
rely anymore on datadir_source, and use instead a datadir argument. If
datadir is NULL, the remote code path is used. If it is not NULL,
local code path is used. This way we can make the target node use
fetchFile only, and remove the duplication your patch is adding, as
well as future ones like that. Thoughts?

>> Except that, the patch looks pretty neat to me. I was wondering as well:
>> what tests did you run up to now with this patch? I am attaching an updated
>> version of my test script I used for some more complex scenarios. Feel free
>> to play with it.
>
> I was running manual tests like a noob :) When you attached you bash script,
> I've switched to it.

[product_placement]The patch to improve the recovery test suite
submitted in this CF would have eased the need of bash-ing test cases
here, and we could have test cases directly included in your
patch.[/product_placement]

> A also added additional check in findCommonAncestorTimeline(). Two standbys
> could be independently promoted and get the same new timeline ID. Now, it's
> checked that timelines, that we assume to be the same, have equal begins.
> Begins could still match by coincidence. But the same risk exists in
> original pg_rewind, though.

Really? pg_rewind blocks attempts to rewind two nodes that are already
on the same timeline before checking if their timeline history map at
some point or not:
/*
 * If both clusters are already on the same timeline, there's nothing to
 * do.
 */
if (ControlFile_target.checkPointCopy.ThisTimeLineID ==
ControlFile_source.checkPointCopy.ThisTimeLineID)
pg_fatal("source and target cluster are on the same
timeline\n");
And this seems really justified to me. Imagine that you have one
master, with two standbys linked to it. If both standbys are promoted
to the same timeline, you could easily replug them to the master, but
I fail to see the point to be able to replug one promoted standby with
the other in this case: those nodes have segment and history files
that map with each other, an operator could easily mess up things in
such a configuration.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-09-08 Thread Michael Paquier
On Tue, Sep 8, 2015 at 1:14 AM, Alexander Korotkov wrote:
> On Thu, Aug 20, 2015 at 9:57 AM, Michael Paquier wrote:
>> The code above looks correct to me when scanning the WAL history onwards
>> though, which is what is done when extracting the page map, but not
>> backwards when we try to find the last common checkpoint record. This code
>> actually fails trying to open 2/0/2 that does not exist in the promoted
>> standby's pg_xlog in my test case.
>>
>> Attached is a small script I have used to reproduce the failure.
>
>
> Right, thanks! It should be fixed in the attached version of patch.

So, instead of a code review, I have been torturing your patch and did
advanced tests on it with the attached script, that creates a cluster
as follows:
 master (5432)
  /\
 1 (5433)   2 (5434)
 |
 3 (5435)
Once cluster is created, nodes are promoted in a certain order giving
them different timeline jump properties:
- master, stays on tli 1
- standby 1, tli 1->2
- standby 2, tli 1->3
- standby 3, tli 1->2->4
And data is inserted on each of them to make WAL fork properly.
Finally the script tries to rewind one node using another node as
source, and then tries to link this target node back to the source
node via streaming replication.

I have tested all the one-one permutations possible in the structure
above (see commented portions at the bottom of my script), and all of
them worked. I have to say that from the testing prospective this
patch looks in great shape, and will greatly improve the use cases of
pg_rewind!

I am planning to do as well a detailed code review rather soon.
Regards,
-- 
Michael


rewind_test.bash
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-09-08 Thread Alexander Korotkov
On Tue, Sep 8, 2015 at 10:28 AM, Michael Paquier 
wrote:

> On Tue, Sep 8, 2015 at 1:14 AM, Alexander Korotkov wrote:
> > On Thu, Aug 20, 2015 at 9:57 AM, Michael Paquier wrote:
> >> The code above looks correct to me when scanning the WAL history onwards
> >> though, which is what is done when extracting the page map, but not
> >> backwards when we try to find the last common checkpoint record. This
> code
> >> actually fails trying to open 2/0/2 that does not exist in the promoted
> >> standby's pg_xlog in my test case.
> >>
> >> Attached is a small script I have used to reproduce the failure.
> >
> >
> > Right, thanks! It should be fixed in the attached version of patch.
>
> So, instead of a code review, I have been torturing your patch and did
> advanced tests on it with the attached script, that creates a cluster
> as follows:
>  master (5432)
>   /\
>  1 (5433)   2 (5434)
>  |
>  3 (5435)
> Once cluster is created, nodes are promoted in a certain order giving
> them different timeline jump properties:
> - master, stays on tli 1
> - standby 1, tli 1->2
> - standby 2, tli 1->3
> - standby 3, tli 1->2->4
> And data is inserted on each of them to make WAL fork properly.
> Finally the script tries to rewind one node using another node as
> source, and then tries to link this target node back to the source
> node via streaming replication.
>
> I have tested all the one-one permutations possible in the structure
> above (see commented portions at the bottom of my script), and all of
> them worked. I have to say that from the testing prospective this
> patch looks in great shape, and will greatly improve the use cases of
> pg_rewind!
>

Great! Many thanks for your testing!

I am planning to do as well a detailed code review rather soon.
>

Good, thanks.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-09-07 Thread Alexander Korotkov
On Thu, Aug 20, 2015 at 9:57 AM, Michael Paquier 
wrote:

> On Wed, Jul 22, 2015 at 4:28 PM, Alexander Korotkov <
> a.korot...@postgrespro.ru> wrote:
>
>> On Wed, Jul 22, 2015 at 8:48 AM, Michael Paquier <
>> michael.paqu...@gmail.com> wrote
>>
>>> On Mon, Jul 20, 2015 at 9:18 PM, Alexander Korotkov
>>>  wrote:
>>> > attached patch allows pg_rewind to work when target timeline was
>>> switched.
>>> > Actually, this patch fixes TODO from pg_rewind comments.
>>> >
>>> >   /*
>>> >* Trace the history backwards, until we hit the target timeline.
>>> >*
>>> >* TODO: This assumes that there are no timeline switches on the
>>> target
>>> >* cluster after the fork.
>>> >*/
>>> >
>>> > This patch allows pg_rewind to handle data directory synchronization
>>> is much
>>> > more general way.
>>> For instance, user can return promoted standby to old master.
>>
>>
> +   /*
> +* Since incomplete segments are copied into next
> timelines, find the
> +* lastest timeline holding required segment.
> +*/
> +   while (private->tliIndex < targetNentries - 1 &&
> +  targetHistory[private->tliIndex].end <
> targetSegEnd)
> +   {
> +   private->tliIndex++;
> +   tli_index++;
> +   }
> It seems to me that the patch is able to handle timeline switches onwards,
> but not backwards and this is what would be required to return a promoted
> standby, that got switched to let's say timeline 2 when promoted, to an old
> master, that is still on timeline 1. This code actually fails when scanning
> for the last checkpoint before WAL forked as it will be on the timeline 1
> of the old master. Imagine for example that the WAL has forked at
> 0/30X which is saved in segment 00020003 (say 2/0/3) on
> the promoted standby, and that the last checkpoint record is on 0/20X,
> which is part of 00010002 (1/0/2). I think that we should
> scan 2/0/3 (not the partial segment 1/0/3), and then 1/0/2 when looking for
> the last checkpoint record. Hence the startup index TLI should be set to
> the highest timeline and should be decremented depending on what is in the
> history file.
> The code above looks correct to me when scanning the WAL history onwards
> though, which is what is done when extracting the page map, but not
> backwards when we try to find the last common checkpoint record. This code
> actually fails trying to open 2/0/2 that does not exist in the promoted
> standby's pg_xlog in my test case.
>
> Attached is a small script I have used to reproduce the failure.
>

Right, thanks! It should be fixed in the attached version of patch.

I think that the documentation needs a brush up as well to outline the fact
> that pg_rewind would be able to put back as well a standby in a cluster,
> after for example an operator mistake when promoting a node that should not
> be.
>

Yes. But from current pg_rewind documentation I can't figure out that it
can't do so. However, we can add an explicit statement which claims that it
can.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


pg-rewind-target-switch-3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-08-20 Thread Michael Paquier
On Wed, Jul 22, 2015 at 4:28 PM, Alexander Korotkov 
a.korot...@postgrespro.ru wrote:

 On Wed, Jul 22, 2015 at 8:48 AM, Michael Paquier 
 michael.paqu...@gmail.com wrote

 On Mon, Jul 20, 2015 at 9:18 PM, Alexander Korotkov
 a.korot...@postgrespro.ru wrote:
  attached patch allows pg_rewind to work when target timeline was
 switched.
  Actually, this patch fixes TODO from pg_rewind comments.
 
/*
 * Trace the history backwards, until we hit the target timeline.
 *
 * TODO: This assumes that there are no timeline switches on the
 target
 * cluster after the fork.
 */
 
  This patch allows pg_rewind to handle data directory synchronization is
 much
  more general way.
 For instance, user can return promoted standby to old master.


+   /*
+* Since incomplete segments are copied into next
timelines, find the
+* lastest timeline holding required segment.
+*/
+   while (private-tliIndex  targetNentries - 1 
+  targetHistory[private-tliIndex].end 
targetSegEnd)
+   {
+   private-tliIndex++;
+   tli_index++;
+   }
It seems to me that the patch is able to handle timeline switches onwards,
but not backwards and this is what would be required to return a promoted
standby, that got switched to let's say timeline 2 when promoted, to an old
master, that is still on timeline 1. This code actually fails when scanning
for the last checkpoint before WAL forked as it will be on the timeline 1
of the old master. Imagine for example that the WAL has forked at
0/30X which is saved in segment 00020003 (say 2/0/3) on
the promoted standby, and that the last checkpoint record is on 0/20X,
which is part of 00010002 (1/0/2). I think that we should
scan 2/0/3 (not the partial segment 1/0/3), and then 1/0/2 when looking for
the last checkpoint record. Hence the startup index TLI should be set to
the highest timeline and should be decremented depending on what is in the
history file.
The code above looks correct to me when scanning the WAL history onwards
though, which is what is done when extracting the page map, but not
backwards when we try to find the last common checkpoint record. This code
actually fails trying to open 2/0/2 that does not exist in the promoted
standby's pg_xlog in my test case.

Attached is a small script I have used to reproduce the failure.

I think that the documentation needs a brush up as well to outline the fact
that pg_rewind would be able to put back as well a standby in a cluster,
after for example an operator mistake when promoting a node that should not
be.
Thoughts?
-- 
Michael


rewind_test.bash
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-07-22 Thread Alexander Korotkov
On Wed, Jul 22, 2015 at 8:48 AM, Michael Paquier michael.paqu...@gmail.com
wrote:

 On Mon, Jul 20, 2015 at 9:18 PM, Alexander Korotkov
 a.korot...@postgrespro.ru wrote:
  attached patch allows pg_rewind to work when target timeline was
 switched.
  Actually, this patch fixes TODO from pg_rewind comments.
 
/*
 * Trace the history backwards, until we hit the target timeline.
 *
 * TODO: This assumes that there are no timeline switches on the target
 * cluster after the fork.
 */
 
  This patch allows pg_rewind to handle data directory synchronization is
 much
  more general way. For instance, user can return promoted standby to old
  master.

 Yes. That would be useful.

  In this patch target timeline history is exposed as global variable.
 Index
  in target timeline history is used in function interfaces instead of
  specifying TLI directly. Thus, SimpleXLogPageRead() can easily start
 reading
  XLOGs from next timeline when current timeline ends.

 The implementation looks rather neat by having a first look at it, but
 the attached patch fails to compile:
 pg_rewind.c:488:4: error: use of undeclared identifier 'targetEntry'
 targetEntry = targetHistory[i];
 ^
 Nice to see as well that this has been added to the CF of September.


Uh, sorry. Fixed version is attached.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


pg-rewind-target-switch-2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use pg_rewind when target timeline was switched

2015-07-21 Thread Michael Paquier
On Mon, Jul 20, 2015 at 9:18 PM, Alexander Korotkov
a.korot...@postgrespro.ru wrote:
 attached patch allows pg_rewind to work when target timeline was switched.
 Actually, this patch fixes TODO from pg_rewind comments.

   /*
* Trace the history backwards, until we hit the target timeline.
*
* TODO: This assumes that there are no timeline switches on the target
* cluster after the fork.
*/

 This patch allows pg_rewind to handle data directory synchronization is much
 more general way. For instance, user can return promoted standby to old
 master.

Yes. That would be useful.

 In this patch target timeline history is exposed as global variable. Index
 in target timeline history is used in function interfaces instead of
 specifying TLI directly. Thus, SimpleXLogPageRead() can easily start reading
 XLOGs from next timeline when current timeline ends.

The implementation looks rather neat by having a first look at it, but
the attached patch fails to compile:
pg_rewind.c:488:4: error: use of undeclared identifier 'targetEntry'
targetEntry = targetHistory[i];
^
Nice to see as well that this has been added to the CF of September.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Use pg_rewind when target timeline was switched

2015-07-20 Thread Alexander Korotkov
Hackers,

attached patch allows pg_rewind to work when target timeline was switched.
Actually, this patch fixes TODO from pg_rewind comments.

  /*
   * Trace the history backwards, until we hit the target timeline.
   *
   * TODO: This assumes that there are no timeline switches on the target
   * cluster after the fork.
   */

This patch allows pg_rewind to handle data directory synchronization is
much more general way. For instance, user can return promoted standby to
old master.

In this patch target timeline history is exposed as global variable. Index
in target timeline history is used in function interfaces instead of
specifying TLI directly. Thus, SimpleXLogPageRead() can easily start
reading XLOGs from next timeline when current timeline ends.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


pg_rewind_target_switch.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers