On 2016-10-24 17:49:13 +0200, Petr Jelinek wrote: > + * Determine which timeline to read an xlog page from and set the > + * XLogReaderState's state->currTLI to that timeline ID.
"XLogReaderState's state->currTLI" - the state's a bit redundant. > + * The caller must also make sure it doesn't read past the current redo > pointer > + * so it doesn't fail to notice that the current timeline became historical. > + */ Not sure what that means? The redo pointer usually menas the "logical start" of the last checkpoint, but I don't see how thta could be meant here? > +static void > +XLogReadDetermineTimeline(XLogReaderState *state, XLogRecPtr wantPage, > uint32 wantLength) > +{ > + const XLogRecPtr lastReadPage = state->readSegNo * XLogSegSize + > state->readOff; > + > + elog(DEBUG4, "Determining timeline for read at %X/%X+%X", > + (uint32)(wantPage>>32), (uint32)wantPage, wantLength); Doing this on every single read from a page seems quite verbose. It's also (like most or all the following debug elogs) violating the casing prescribed by the message style guidelines. > + /* > + * If the desired page is currently read in and valid, we have nothing > to do. > + * > + * The caller should've ensured that it didn't previously advance > readOff > + * past the valid limit of this timeline, so it doesn't matter if the > current > + * TLI has since become historical. > + */ > + if (lastReadPage == wantPage && > + state->readLen != 0 && > + lastReadPage + state->readLen >= wantPage + > Min(wantLength,XLOG_BLCKSZ-1)) > + { > + elog(DEBUG4, "Wanted data already valid"); //XXX > + return; > + } With that kind of comment/XXX present, this surely can't be ready for committer? > + /* > + * If we're reading from the current timeline, it hasn't become > historical > + * and the page we're reading is after the last page read, we can again > + * just carry on. (Seeking backwards requires a check to make sure the > older > + * page isn't on a prior timeline). > + */ How can ThisTimeLineID become historical? > + if (state->currTLI == ThisTimeLineID && wantPage >= lastReadPage) > + { > + Assert(state->currTLIValidUntil == InvalidXLogRecPtr); > + elog(DEBUG4, "On current timeline"); > + return; > + } Also, is it actually ok to rely on ThisTimeLineID here? That's IIRC not maintained outside of the startup process, until recovery ended (cf it being set in InitXLOGAccess() called via RecoveryInProgress()). > /* > - * TODO: we're going to have to do something more intelligent > about > - * timelines on standbys. Use readTimeLineHistory() and > - * tliOfPointInHistory() to get the proper LSN? For now we'll > catch > - * that case earlier, but the code and TODO is left in here for > when > - * that changes. > + * Check which timeline to get the record from. > + * > + * We have to do it each time through the loop because if we're > in > + * recovery as a cascading standby, the current timeline > might've > + * become historical. > */ I guess you mean cascading as "replicating logically from a physical standby"? I don't think it's good to use cascading here, because that's usually thought to mean doing physical SR from a standby... > diff --git a/src/backend/replication/logical/logicalfuncs.c > b/src/backend/replication/logical/logicalfuncs.c > index 318726e..4315fb3 100644 > --- a/src/backend/replication/logical/logicalfuncs.c > +++ b/src/backend/replication/logical/logicalfuncs.c > @@ -234,12 +234,6 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo > fcinfo, bool confirm, bool bin > rsinfo->setResult = p->tupstore; > rsinfo->setDesc = p->tupdesc; > > - /* compute the current end-of-wal */ > - if (!RecoveryInProgress()) > - end_of_wal = GetFlushRecPtr(); > - else > - end_of_wal = GetXLogReplayRecPtr(NULL); > - > ReplicationSlotAcquire(NameStr(*name)); > > PG_TRY(); > @@ -279,6 +273,12 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo > fcinfo, bool confirm, bool bin > /* invalidate non-timetravel entries */ > InvalidateSystemCaches(); > > + if (!RecoveryInProgress()) > + end_of_wal = GetFlushRecPtr(); > + else > + end_of_wal = GetXLogReplayRecPtr(NULL); > + > + /* Decode until we run out of records */ > while ((startptr != InvalidXLogRecPtr && startptr < end_of_wal) > || > (ctx->reader->EndRecPtr != InvalidXLogRecPtr && > ctx->reader->EndRecPtr < end_of_wal)) > { That seems like a pretty random change? > diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h > index deaa7f5..caff9a6 100644 > --- a/src/include/access/xlogreader.h > +++ b/src/include/access/xlogreader.h > @@ -27,6 +27,10 @@ > > #include "access/xlogrecord.h" > > +#ifndef FRONTEND > +#include "nodes/pg_list.h" > +#endif Why is that needed? > diff --git a/src/test/modules/test_slot_timelines/README > b/src/test/modules/test_slot_timelines/README > new file mode 100644 > index 0000000..585f02f > --- /dev/null > +++ b/src/test/modules/test_slot_timelines/README > @@ -0,0 +1,19 @@ > +A test module for logical decoding failover and timeline following. > + > +This module provides a minimal way to maintain logical slots on replicas > +that mirror the state on the master. It doesn't make decoding possible, > +just tracking slot state so that a decoding client that's using the master > +can follow a physical failover to the standby. The master doesn't know > +about the slots on the standby, they're synced by a client that connects > +to both. > + > +This is intentionally not part of the test_decoding module because that's > meant > +to serve as example code, where this module exercises internal server > features > +by unsafely exposing internal state to SQL. It's not the right way to do > +failover, it's just a simple way to test it from the perl TAP framework to > +prove the feature works. > + > +In a practical implementation of this approach a bgworker on the master would > +monitor slot positions and relay them to a bgworker on the standby that > applies > +the position updates without exposing slot internals to SQL. That's too > complex > +for this test framework though. I still don't want to have this code in postgres, it's just an example for doing something bad. Lets get proper feedback control in, and go from there. > diff --git > a/src/test/modules/test_slot_timelines/expected/load_extension_1.out > b/src/test/modules/test_slot_timelines/expected/load_extension_1.out > new file mode 100644 > index 0000000..0db21e4 > --- /dev/null > +++ b/src/test/modules/test_slot_timelines/expected/load_extension_1.out > @@ -0,0 +1,7 @@ > +CREATE EXTENSION test_slot_timelines; > +SELECT test_slot_timelines_create_logical_slot('test_slot', 'test_decoding'); > +ERROR: replication slots can only be used if max_replication_slots > 0 > +SELECT test_slot_timelines_advance_logical_slot('test_slot', > txid_current()::text::xid, txid_current()::text::xid, > pg_current_xlog_location(), pg_current_xlog_location()); > +ERROR: replication slots can only be used if max_replication_slots > 0 > +SELECT pg_drop_replication_slot('test_slot'); > +ERROR: replication slots can only be used if max_replication_slots > 0 > diff --git a/src/test/modules/test_slot_timelines/sql/load_extension.sql > b/src/test/modules/test_slot_timelines/sql/load_extension.sql > new file mode 100644 > index 0000000..2440355 It doesn't seem worthwhlie to maintain a test for this - why do we run the tests with those settings in the first place? Greetings, Andres Freund -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers