Re: [HACKERS] [PATCH] Logical decoding timeline following take II

2016-11-28 Thread Craig Ringer
It's unlikely this will get in as a standalone patch, so I'm closing
the CF entry for it as RwF (Returned with Feedback):

https://commitfest.postgresql.org/11/779/

It is now being tracked as part of logical decoding on standby at

https://commitfest.postgresql.org/12/788/

in thread "Logical decoding on standby", which began as
https://www.postgresql.org/message-id/flat/CAMsr+YFi-LV7S8ehnwUiZnb=1h_14PwQ25d-vyUNq-f5S5r=z...@mail.gmail.com#CAMsr+YFi-LV7S8ehnwUiZnb=1h_14PwQ25d-vyUNq-f5S5r=z...@mail.gmail.com
.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Logical decoding timeline following take II

2016-11-15 Thread Craig Ringer
On 16 November 2016 at 12:44, Craig Ringer wrote:

> Despite that, I've attached a revised version of the current approach
> incorporating fixes for the issues you mentioned. It's preceded by the
> patch to add an --endpos option to pg_recvlogical so that we can
> properly test the walsender interface too.

I hadn't rebased the patch that made the timeline following tests use
the recvlogical support in PostgreNode.pm. Now attached.

Even if timeline following isn't accepted as-is, I'd greatly
appreciate inclusion of the first two patches as they add basic
coverage of pg_recvlogical and a helper to make using it in tests
simple and reliable.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From b1266ad1adba8619bf43d4297d1ed6392e302198 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Thu, 1 Sep 2016 12:37:40 +0800
Subject: [PATCH 1/3] Add an optional --endpos LSN argument to pg_recvlogical
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

pg_recvlogical usually just runs until cancelled or until the upstream
server disconnects. For some purposes, especially testing, it's useful
to be able to stop receiving at a specified LSN without having to parse
the output and deal with buffering issues, etc.

Add an --endpos parameter that takes the LSN at which receiving should
stop and after which no further messages are written.

Craig Ringer, Álvaro Herrera
---
 doc/src/sgml/ref/pg_recvlogical.sgml   |  34 
 src/bin/pg_basebackup/pg_recvlogical.c | 145 +
 2 files changed, 164 insertions(+), 15 deletions(-)

diff --git a/doc/src/sgml/ref/pg_recvlogical.sgml b/doc/src/sgml/ref/pg_recvlogical.sgml
index b35881f..d066ce8 100644
--- a/doc/src/sgml/ref/pg_recvlogical.sgml
+++ b/doc/src/sgml/ref/pg_recvlogical.sgml
@@ -38,6 +38,14 @@ PostgreSQL documentation
constraints as , plus those for logical
replication (see ).
   
+
+  
+   pg_recvlogical has no equivalent to the logical decoding
+   SQL interface's peek and get modes. It sends replay confirmations for
+   data lazily as it receives it and on clean exit. To examine pending data on
+a slot without consuming it, use
+   pg_logical_slot_peek_changes.
+  
  
 
  
@@ -155,6 +163,32 @@ PostgreSQL documentation
  
 
  
+  -E lsn
+  --endpos=lsn
+  
+   
+In --start mode, automatically stop replication
+and exit with normal exit status 0 when receiving reaches the
+specified LSN.  If specified when not in --start
+mode, an error is raised.
+   
+
+   
+If there's a record with LSN exactly equal to lsn,
+the record will be output.
+   
+
+   
+The --endpos option is not aware of transaction
+boundaries and may truncate output partway through a transaction.
+Any partially output transaction will not be consumed and will be
+replayed again when the slot is next read from. Individual messages
+are never truncated.
+   
+  
+ 
+
+ 
   --if-not-exists
   

diff --git a/src/bin/pg_basebackup/pg_recvlogical.c b/src/bin/pg_basebackup/pg_recvlogical.c
index cb5f989..c700edf 100644
--- a/src/bin/pg_basebackup/pg_recvlogical.c
+++ b/src/bin/pg_basebackup/pg_recvlogical.c
@@ -40,6 +40,7 @@ static int	noloop = 0;
 static int	standby_message_timeout = 10 * 1000;		/* 10 sec = default */
 static int	fsync_interval = 10 * 1000; /* 10 sec = default */
 static XLogRecPtr startpos = InvalidXLogRecPtr;
+static XLogRecPtr endpos = InvalidXLogRecPtr;
 static bool do_create_slot = false;
 static bool slot_exists_ok = false;
 static bool do_start_slot = false;
@@ -63,6 +64,9 @@ static XLogRecPtr output_fsync_lsn = InvalidXLogRecPtr;
 static void usage(void);
 static void StreamLogicalLog(void);
 static void disconnect_and_exit(int code);
+static bool flushAndSendFeedback(PGconn *conn, TimestampTz *now);
+static void prepareToTerminate(PGconn *conn, XLogRecPtr endpos,
+   bool keepalive, XLogRecPtr lsn);
 
 static void
 usage(void)
@@ -81,6 +85,7 @@ usage(void)
 			 " time between fsyncs to the output file (default: %d)\n"), (fsync_interval / 1000));
 	printf(_("      --if-not-exists    do not error if slot already exists when creating a slot\n"));
 	printf(_("  -I, --startpos=LSN where in an existing slot should the streaming start\n"));
+	printf(_("  -E, --endpos=LSN   exit after receiving the specified LSN\n"));
 	printf(_("  -n, --no-loop  do not loop on connection lost\n"));
 	printf(_("  -o, --option=NAME[=VALUE]\n"
 			 " pass option NAME with optional value VALUE to the\n"
@@ -281,6 +286,7 @@ StreamLogicalLog(void)
 		int			bytes_written;
 		int64		now;
 		int			hdr_len;
+		XLogRecPtr	cur_record_lsn = InvalidXLogRecPtr;
 
 		if (copybuf != NULL)
 		{
@@ 

Re: [HACKERS] [PATCH] Logical decoding timeline following take II

2016-11-15 Thread Craig Ringer
On 12 November 2016 at 23:07, Andres Freund wrote:
> On 2016-10-24 17:49:13 +0200, Petr Jelinek wrote:
>> + * Determine which timeline to read an xlog page from and set the
>> + * XLogReaderState's state->currTLI to that timeline ID.
>
> "XLogReaderState's state->currTLI" - the state's a bit redundant.

Thanks for taking a look at the patch.

Agreed re above, fixed.



You know, having returned to this work after a long break doing other
things, it's clear that so much of the same work is being done in
XLogSendPhysical(...) and walsender.c's XLogRead(...) that maybe it is
worth facing the required refactoring so that we can use that in
logical decoding from both the walsender and the SQL interface.

The approach I originally took focused above all else on being
minimally intrusive, rather than clean and clear. Maybe it's better to
tidy this up instead.

Instead of coupling timeline following in logical decoding to the
XLogReader struct and effectively duplicating the physical walsender's
logic, move the walsender.c globals
static TimeLineID sendTimeLine = 0;
static TimeLineID sendTimeLineNextTLI = 0;
static bool sendTimeLineIsHistoric = false;
static XLogRecPtr sendTimeLineValidUpto = InvalidXLogRecPtr;

into a new struct in timeline.h and move the logic into timeline.c. That
way we stop relying on so many magic globals in the walsender. Don't keep
this state as a global; instead, init it in StartReplication and pass
it as a param to WalSndLoop. For logical decoding from walsender,
store the walsender's copy in the XLogReaderState. For logical
decoding from SQL, set it up in pg_logical_slot_get_changes_guts(...)
and again store it in XLogReaderState.

In the process, finally unify the two XLogRead(...) functions in
xlogutils.c and walsender.c and merge walsender's
logical_read_xlog_page(...) with xlogutils'
read_local_xlog_page(...). Sure, we can't rely on a normal
backend's latch being set when WAL is written like we can for the
walsender, but do we have to duplicate everything else? We can always add
a miscadmin.h var like IsWALSender and test that to see if we can
expect to have our latch set when new WAL comes in and adjust our
latch wait timeout interval appropriately.

The downside is of course that it touches physical replication, unlike
the current approach which avoids touching anything that could affect
physical replication at the cost of increasing the duplication between
physical and logical replication logic.

>> + * The caller must also make sure it doesn't read past the current redo pointer
>> + * so it doesn't fail to notice that the current timeline became historical.
>> + */
>
> Not sure what that means? The redo pointer usually means the "logical
> start" of the last checkpoint, but I don't see how that could be meant
> here?

What I was trying to say was the current replay position on a standby. Amended.

>> +static void
>> +XLogReadDetermineTimeline(XLogReaderState *state, XLogRecPtr wantPage, uint32 wantLength)
>> +{
>> + const XLogRecPtr lastReadPage = state->readSegNo * XLogSegSize + state->readOff;
>> +
>> + elog(DEBUG4, "Determining timeline for read at %X/%X+%X",
>> + (uint32)(wantPage>>32), (uint32)wantPage, wantLength);
>
> Doing this on every single read from a page seems quite verbose.   It's
> also (like most or all the following debug elogs) violating the casing
> prescribed by the message style guidelines.

Agreed. It's unnecessary now. It's the sort of thing I'd want to keep
if Pg had fine-grained, module-level log control, but it doesn't.

>> + /*
>> +  * If we're reading from the current timeline, it hasn't become historical
>> +  * and the page we're reading is after the last page read, we can again
>> +  * just carry on. (Seeking backwards requires a check to make sure the older
>> +  * page isn't on a prior timeline).
>> +  */
>
> How can ThisTimeLineID become historical?

It can't, right now. Though I admit I didn't realise that at the time.
Presently ThisTimeLineID is only set in the walsender by
GetStandbyFlushRecPtr as called by IdentifySystem, and in
StartReplication, but not afterwards for logical replication, so
logical rep won't notice timeline transitions when on a standby unless
a decoding session is restarted. We don't support decoding sessions in
recovery, so it can't happen yet.

It will, when we're on a cascading standby and the upstream is
promoted, once we support logical decoding on standby. As part of that
we'll need to maintain the timeline in the walsender in logical
decoding like we do in physical, and limit logical decoding to the
currently replayed position with something like walsender's
GetStandbyFlushRecPtr(), but usable from the SQL interface too, of
course.
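
As a rough illustration of that limit (a sketch under assumptions, not
code from any patch): the helper names below are hypothetical, and the
flush and replay positions are passed in as plain arguments rather than
fetched from the server.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the PostgreSQL typedef so the sketch is self-contained. */
typedef uint64_t XLogRecPtr;

/*
 * Hypothetical helper: pick the LSN that decoding must not read past.
 * On a primary that is the flush position; in recovery it is the
 * currently replayed position.
 */
static XLogRecPtr
toy_decoding_read_limit(bool in_recovery, XLogRecPtr flush_ptr,
                        XLogRecPtr replay_ptr)
{
    return in_recovery ? replay_ptr : flush_ptr;
}

/* Clamp a requested read end so it never goes past the limit. */
static XLogRecPtr
toy_clamp_read_end(XLogRecPtr want_end, XLogRecPtr limit)
{
    assert(limit != 0);     /* 0 is the invalid LSN */
    return want_end < limit ? want_end : limit;
}
```

The same pair of helpers could serve both the walsender and the SQL
interface, which is the point of not tying this to walsender-only state.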

I'm currently implementing logical decoding on standby on top of this
and the catalog_xmin feedback patch, and in the process will be adding
tests for logical decoding on a physical 

Re: [HACKERS] [PATCH] Logical decoding timeline following take II

2016-11-12 Thread Andres Freund
On 2016-10-24 17:49:13 +0200, Petr Jelinek wrote:
> + * Determine which timeline to read an xlog page from and set the
> + * XLogReaderState's state->currTLI to that timeline ID.

"XLogReaderState's state->currTLI" - the state's a bit redundant.

> + * The caller must also make sure it doesn't read past the current redo pointer
> + * so it doesn't fail to notice that the current timeline became historical.
> + */

Not sure what that means? The redo pointer usually means the "logical
start" of the last checkpoint, but I don't see how that could be meant
here?

> +static void
> +XLogReadDetermineTimeline(XLogReaderState *state, XLogRecPtr wantPage, uint32 wantLength)
> +{
> + const XLogRecPtr lastReadPage = state->readSegNo * XLogSegSize + state->readOff;
> +
> + elog(DEBUG4, "Determining timeline for read at %X/%X+%X",
> + (uint32)(wantPage>>32), (uint32)wantPage, wantLength);

Doing this on every single read from a page seems quite verbose.   It's
also (like most or all the following debug elogs) violating the casing
prescribed by the message style guidelines.


> + /*
> +  * If the desired page is currently read in and valid, we have nothing to do.
> +  *
> +  * The caller should've ensured that it didn't previously advance readOff
> +  * past the valid limit of this timeline, so it doesn't matter if the current
> +  * TLI has since become historical.
> +  */
> + if (lastReadPage == wantPage &&
> + state->readLen != 0 &&
> + lastReadPage + state->readLen >= wantPage + Min(wantLength,XLOG_BLCKSZ-1))
> + {
> + elog(DEBUG4, "Wanted data already valid"); //XXX
> + return;
> + }

With that kind of comment/XXX present, this surely can't be ready for
committer?


> + /*
> +  * If we're reading from the current timeline, it hasn't become historical
> +  * and the page we're reading is after the last page read, we can again
> +  * just carry on. (Seeking backwards requires a check to make sure the older
> +  * page isn't on a prior timeline).
> +  */

How can ThisTimeLineID become historical?

> + if (state->currTLI == ThisTimeLineID && wantPage >= lastReadPage)
> + {
> + Assert(state->currTLIValidUntil == InvalidXLogRecPtr);
> + elog(DEBUG4, "On current timeline");
> + return;
> + }

Also, is it actually ok to rely on ThisTimeLineID here? That's IIRC not
maintained outside of the startup process, until recovery ended (cf it
being set in InitXLOGAccess() called via RecoveryInProgress()).


>   /*
> -  * TODO: we're going to have to do something more intelligent about
> -  * timelines on standbys. Use readTimeLineHistory() and
> -  * tliOfPointInHistory() to get the proper LSN? For now we'll catch
> -  * that case earlier, but the code and TODO is left in here for when
> -  * that changes.
> +  * Check which timeline to get the record from.
> +  *
> +  * We have to do it each time through the loop because if we're in
> +  * recovery as a cascading standby, the current timeline might've
> +  * become historical.
>    */

I guess you mean cascading as "replicating logically from a physical
standby"? I don't think it's good to use cascading here, because that's
usually thought to mean doing physical SR from a standby...



> diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
> index 318726e..4315fb3 100644
> --- a/src/backend/replication/logical/logicalfuncs.c
> +++ b/src/backend/replication/logical/logicalfuncs.c
> @@ -234,12 +234,6 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
>   rsinfo->setResult = p->tupstore;
>   rsinfo->setDesc = p->tupdesc;
>  
> - /* compute the current end-of-wal */
> - if (!RecoveryInProgress())
> - end_of_wal = GetFlushRecPtr();
> - else
> - end_of_wal = GetXLogReplayRecPtr(NULL);
> -
>   ReplicationSlotAcquire(NameStr(*name));
>  
>   PG_TRY();
> @@ -279,6 +273,12 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
>   /* invalidate non-timetravel entries */
>   InvalidateSystemCaches();
>  
> + if (!RecoveryInProgress())
> + end_of_wal = GetFlushRecPtr();
> + else
> + end_of_wal = GetXLogReplayRecPtr(NULL);
> +
> + /* Decode until we run out of records */
>   while ((startptr != InvalidXLogRecPtr && startptr < end_of_wal) ||
>  (ctx->reader->EndRecPtr != InvalidXLogRecPtr && ctx->reader->EndRecPtr < end_of_wal))
>   {

That seems like a pretty random change?


> diff --git 

Re: [HACKERS] [PATCH] Logical decoding timeline following take II

2016-10-24 Thread Craig Ringer
On 24 October 2016 at 23:49, Petr Jelinek wrote:
> Hi Craig,
>
> On 01/09/16 06:08, Craig Ringer wrote:
>> Hi all
>>
>> Attached is a rebased and updated logical decoding timeline following
>> patch for 10.0.
>>
>> This is a pre-requisite for the pending work on logical decoding on
>> standby servers and simplified failover of logical decoding.
>>
>
> I went over this and it looks fine to me, I only rebased the patch on
> top of current version (we renamed pg_xlog which broke the tests) and
> split the test harness to separate patch. Otherwise I would consider
> this to be ready for committer.
>
> I think this should go in early so that there is enough time in the
> cycle to uncover potential issues if there are any, even though it looks
> all correct to me.

Thanks for the review and update.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Logical decoding timeline following take II

2016-10-24 Thread Petr Jelinek
Hi Craig,

On 01/09/16 06:08, Craig Ringer wrote:
> Hi all
> 
> Attached is a rebased and updated logical decoding timeline following
> patch for 10.0.
> 
> This is a pre-requisite for the pending work on logical decoding on
> standby servers and simplified failover of logical decoding.
> 

I went over this and it looks fine to me. I only rebased the patch on
top of the current version (we renamed pg_xlog, which broke the tests) and
split the test harness into a separate patch. Otherwise I would consider
this to be ready for committer.

I think this should go in early so that there is enough time in the
cycle to uncover potential issues if there are any, even though it looks
all correct to me.

> 
> The test harness code will become unnecessary when proper support for
> logical failover or logical decoding on standby is added, so I'm not
> really sure it should be committed.

Yeah, as I said above, I split out the test harness for this reason. The
good thing is that when the followup patches get in, the test harness
should be easy to remove, as the changes are very localized.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
From 3c775ee99820ea3e915e1a859fa399651e170ffc Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Mon, 24 Oct 2016 17:40:40 +0200
Subject: [PATCH 1/2] Follow timeline switches in logical decoding

When decoding from a logical slot, it's necessary for xlog reading to
be able to read xlog from historical (i.e. not current) timelines.
Otherwise decoding fails after failover to a physical replica because
the oldest still-needed archives are in the historical timeline.

Supporting logical decoding timeline following is a pre-requisite for
logical decoding on physical standby servers. It also makes it
possible to promote a replica with logical slots to a master and
replay from those slots, allowing logical decoding applications to
follow physical failover.

Logical slots cannot actually be created on a replica without use of
the low-level C slot management APIs so this is mostly foundation work
for subsequent changes to enable logical decoding on standbys.

This commit includes a module in src/test/modules with functions to
manipulate the slots (which is not otherwise possible in SQL code) in
order to enable testing, and a new test in src/test/recovery to ensure
that the behavior is as expected.

Note that an earlier version of logical decoding timeline following
was committed to 9.5 as 24c5f1a103ce, 3a3b309041b0, 82c83b337202, and
f07d18b6e94d. It was then reverted by c1543a81a7a8 just after 9.5
feature freeze when issues were discovered too late to safely fix them
in the 9.5 release cycle.

The prior approach failed to consider that a record could be split
across pages that are on different segments, where the new segment
contains the start of a new timeline. In that case the old segment
might be missing or renamed with a .partial suffix.

This patch reworks the logic to be page-based and in the process
simplifies how the last timeline for a segment is looked up.

Slot timeline following only works in a backend. Frontend support can
be added separately, where it could be useful for pg_xlogdump etc. once
support for timeline.c, List, etc. is added for frontend code.
---
 src/backend/access/transam/xlogutils.c | 207 +++--
 src/backend/replication/logical/logicalfuncs.c |  12 +-
 src/include/access/xlogreader.h|  11 ++
 3 files changed, 209 insertions(+), 21 deletions(-)

diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 51a8e8d..014978f 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -19,6 +19,7 @@
 
 #include 
 
+#include "access/timeline.h"
 #include "access/xlog.h"
 #include "access/xlog_internal.h"
 #include "access/xlogutils.h"
@@ -660,6 +661,7 @@ XLogRead(char *buf, TimeLineID tli, XLogRecPtr startptr, Size count)
/* state maintained across calls */
static int  sendFile = -1;
static XLogSegNo sendSegNo = 0;
+   static TimeLineID sendTLI = 0;
static uint32 sendOff = 0;
 
p = buf;
@@ -675,7 +677,8 @@ XLogRead(char *buf, TimeLineID tli, XLogRecPtr startptr, Size count)
startoff = recptr % XLogSegSize;
 
/* Do we need to switch to a different xlog segment? */
-   if (sendFile < 0 || !XLByteInSeg(recptr, sendSegNo))
+   if (sendFile < 0 || !XLByteInSeg(recptr, sendSegNo) ||
+   sendTLI != tli)
{
char    path[MAXPGPATH];
 
@@ -702,6 +705,7 @@ XLogRead(char *buf, TimeLineID tli, XLogRecPtr startptr, Size count)
path)));
}
sendOff = 0;
+   sendTLI = tli;
  

Re: [HACKERS] [PATCH] Logical decoding timeline following take II

2016-10-02 Thread Michael Paquier
On Thu, Sep 1, 2016 at 1:08 PM, Craig Ringer wrote:
> Attached is a rebased and updated logical decoding timeline following
> patch for 10.0.

Moved to next CF, nothing has happened since submission.
-- 
Michael




[HACKERS] [PATCH] Logical decoding timeline following take II

2016-08-31 Thread Craig Ringer
Hi all

Attached is a rebased and updated logical decoding timeline following
patch for 10.0.

This is a pre-requisite for the pending work on logical decoding on
standby servers and simplified failover of logical decoding.

Restating the commit message:
__

Follow timeline switches in logical decoding

When decoding from a logical slot, it's necessary for xlog reading to
be able to read xlog from historical (i.e. not current) timelines.
Otherwise decoding fails after failover to a physical replica because
the oldest still-needed archives are in the historical timeline.

Supporting logical decoding timeline following is a pre-requisite for
logical decoding on physical standby servers. It also makes it
possible to promote a replica with logical slots to a master and
replay from those slots, allowing logical decoding applications to
follow physical failover.

Logical slots cannot actually be created on a replica without use of
the low-level C slot management APIs so this is mostly foundation work
for subsequent changes to enable logical decoding on standbys.

This commit includes a module in src/test/modules with functions to
manipulate the slots (which is not otherwise possible in SQL code) in
order to enable testing, and a new test in src/test/recovery to ensure
that the behavior is as expected.

Note that an earlier version of logical decoding timeline following
was committed to 9.5 as 24c5f1a103ce, 3a3b309041b0, 82c83b337202, and
f07d18b6e94d. It was then reverted by c1543a81a7a8 just after 9.5
feature freeze when issues were discovered too late to safely fix them
in the 9.5 release cycle.

The prior approach failed to consider that a record could be split
across pages that are on different segments, where the new segment
contains the start of a new timeline. In that case the old segment
might be missing or renamed with a .partial suffix.

This patch reworks the logic to be page-based and in the process
simplifies how the last timeline for a segment is looked up.

Slot timeline following only works in a backend. Frontend support can
be added separately, where it could be useful for pg_xlogdump etc. once
support for timeline.c, List, etc. is added for frontend code.
__
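
To make the split-record case in the commit message concrete, here is a
toy per-page timeline lookup. It is only a sketch, not code from the
patch: the Toy*/toy_* names, the 16MB segment and 8kB page sizes, and
the history layout are all made up for illustration, and the lookup only
loosely mirrors what tliOfPointInHistory() does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL typedefs so the sketch is self-contained. */
typedef uint32_t TimeLineID;
typedef uint64_t XLogRecPtr;

/* Illustrative sizes only. */
#define TOY_SEG_SIZE  (UINT64_C(16) * 1024 * 1024)
#define TOY_PAGE_SIZE UINT64_C(8192)

/* One entry of a simplified timeline history. */
typedef struct ToyHistoryEntry
{
    TimeLineID  tli;
    XLogRecPtr  begin;      /* inclusive */
    XLogRecPtr  end;        /* exclusive; 0 means "still current" */
} ToyHistoryEntry;

/* Return the timeline that owns ptr, or 0 if no entry covers it. */
static TimeLineID
toy_tli_of_point(const ToyHistoryEntry *hist, size_t n, XLogRecPtr ptr)
{
    assert(hist != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if (hist[i].begin <= ptr &&
            (hist[i].end == 0 || ptr < hist[i].end))
            return hist[i].tli;
    }
    return 0;
}

/* Start address of the page containing ptr. */
static XLogRecPtr
toy_page_start(XLogRecPtr ptr)
{
    return ptr - (ptr % TOY_PAGE_SIZE);
}

/* Segment number containing ptr. */
static uint64_t
toy_segno(XLogRecPtr ptr)
{
    return ptr / TOY_SEG_SIZE;
}
```

With a timeline switch exactly on a segment boundary, a record that
starts 100 bytes before the boundary has its first page on the old
timeline and its continuation page on the new one, in a different
segment. That is the case the page-based lookup is meant to handle.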


I'm hoping to find time to refactor timeline following so that we
avoid passing timeline information around the xlogreader using
globals, but that'd be a separate change that can be made after this.

I've omitted the --endpos changes for pg_recvlogical, which again can
be added separately.

The test harness code will become unnecessary when proper support for
logical failover or logical decoding on standby is added, so I'm not
really sure it should be committed.

Prior threads:

* 
https://www.postgresql.org/message-id/camsr+yg_1fu_-l8qwsk6okft4jt8dpory2rhxdymy0b5zfk...@mail.gmail.com

* 
https://www.postgresql.org/message-id/CAMsr+YH-C1-X_+s=2nzapnr0wwqja-rumvhsyyzansn93mu...@mail.gmail.com

* http://www.postgresql.org/message-id/20160503165812.GA29604@alvherre.pgsql


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 10d5f4652689fbcaf5b14fe6bd991c98dbf60e00 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Thu, 1 Sep 2016 10:16:55 +0800
Subject: [PATCH] Follow timeline switches in logical decoding

When decoding from a logical slot, it's necessary for xlog reading to
be able to read xlog from historical (i.e. not current) timelines.
Otherwise decoding fails after failover to a physical replica because
the oldest still-needed archives are in the historical timeline.

Supporting logical decoding timeline following is a pre-requisite for
logical decoding on physical standby servers. It also makes it
possible to promote a replica with logical slots to a master and
replay from those slots, allowing logical decoding applications to
follow physical failover.

Logical slots cannot actually be created on a replica without use of
the low-level C slot management APIs so this is mostly foundation work
for subsequent changes to enable logical decoding on standbys.

This commit includes a module in src/test/modules with functions to
manipulate the slots (which is not otherwise possible in SQL code) in
order to enable testing, and a new test in src/test/recovery to ensure
that the behavior is as expected.

Note that an earlier version of logical decoding timeline following
was committed to 9.5 as 24c5f1a103ce, 3a3b309041b0, 82c83b337202, and
f07d18b6e94d. It was then reverted by c1543a81a7a8 just after 9.5
feature freeze when issues were discovered too late to safely fix them
in the 9.5 release cycle.

The prior approach failed to consider that a record could be split
across pages that are on different segments, where the new segment
contains the start of a new timeline. In that case the old segment
might be missing or renamed with a .partial suffix.

This patch reworks the logic to be page-based and in the process
simplifies how the last