Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-09-05 Thread Craig Ringer
On 5 September 2016 at 16:33, Petr Jelinek  wrote:

>> The better alternative is to add a variant on
>> pg_logical_slot_get_changes(...) etc that accepts a start LSN. But
>> it's not convenient or easy for SQL callers to extract the last commit
>> LSN received from the last call to pass it to the next one, so this
>> isn't simple, and is probably best tackled as part of the SQL
>> interface API change Petr and Andres discussed in this thread.
>>
>
> It isn't? I thought lsn is first column of the output of that function?

It is. It's a pain to use from plain SQL (psql, etc.) while also doing
something with the results, but there's no point anyway, since anything
driven by simple SQL will get terminated when the server restarts. So
really my point above is moot.
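
For completeness, grabbing it from psql looks roughly like this (slot name
made up; just a sketch of the clumsy-but-possible approach):

SELECT lsn AS last_lsn
FROM pg_logical_slot_get_changes('test_slot', NULL, NULL)
ORDER BY lsn DESC LIMIT 1 \gset
-- :last_lsn now holds the last decoded position, but the data rows have
-- already been consumed and thrown away, which is part of the awkwardness.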

That doesn't reflect on the slot dirty patch; it just means there's
no need to do anything extra when we add the cursor-like interface
later in order to fully solve this.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-09-05 Thread Simon Riggs
On 5 September 2016 at 03:41, Craig Ringer  wrote:
> On 2 September 2016 at 17:49, Craig Ringer  wrote:
>
>> So the main change becomes the one-liner in my prior mail.
>
> Per feedback from Simon, updated with a new test in src/test/recovery .
>
> If you revert the change to
> src/backend/replication/logical/logicalfuncs.c then the test will
> start failing.
>
> I'd quite like to backpatch the fix since it's simple and safe, but I
don't feel that it's hugely important to do so. This is an annoyance,
not a serious data-risking bug.

I've committed to HEAD only. We can discuss backpatching elsewhere also.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-09-05 Thread Petr Jelinek

On 05/09/16 04:41, Craig Ringer wrote:

On 2 September 2016 at 17:49, Craig Ringer  wrote:


So the main change becomes the one-liner in my prior mail.


Per feedback from Simon, updated with a new test in src/test/recovery .

If you revert the change to
src/backend/replication/logical/logicalfuncs.c then the test will
start failing.

I'd quite like to backpatch the fix since it's simple and safe, but I
don't feel that it's hugely important to do so. This is an annoyance,
not a serious data-risking bug.

My only concern here is that we still lose position on crash. So
SQL-interface callers should still be able to cope with it. But they
won't see it happen if they're only testing with normal shutdowns,
host reboots, etc. In practice users aren't likely to notice this
anyway, though, since most people don't restart the server all the
time. I think it's better than what we have.

This issue could be eliminated completely by calling
ReplicationSlotSave() after ReplicationSlotMarkDirty(). This would
force an immediate flush after a non-peeked SQL interface decoding
call. It means more fsync()ing, though, and the SQL interface can only
be used by polling so that could be a significant performance hit.
We'd want to only flush when the position changed, but even then it'll
mean a sync every time anything gets returned.

The better alternative is to add a variant on
pg_logical_slot_get_changes(...) etc that accepts a start LSN. But
it's not convenient or easy for SQL callers to extract the last commit
LSN received from the last call to pass it to the next one, so this
isn't simple, and is probably best tackled as part of the SQL
interface API change Petr and Andres discussed in this thread.



It isn't? I thought lsn is first column of the output of that function?

--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-09-04 Thread Craig Ringer
On 5 September 2016 at 10:41, Craig Ringer  wrote:
> On 2 September 2016 at 17:49, Craig Ringer  wrote:
>
>> So the main change becomes the one-liner in my prior mail.
>
> Per feedback from Simon, updated with a new test in src/test/recovery .

... attached this time.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From eb7e141195ddecd2e58b52167f83d83130909201 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Mon, 2 May 2016 19:50:22 +0800
Subject: [PATCH] Dirty replication slots when accessed via sql interface

When pg_logical_slot_get_changes(...) sets confirmed_flush_lsn to the point at
which replay stopped, it doesn't dirty the replication slot.  So if the replay
didn't cause restart_lsn or catalog_xmin to change as well, this change will
not get written out to disk, even on a clean shutdown.

If Pg crashes or restarts, a subsequent pg_logical_slot_get_changes(...) call
will see the same changes already replayed since it uses the slot's
confirmed_flush_lsn as the start point for fetching changes. The caller can't
specify a start LSN when using the SQL interface.

Mark the slot as dirty after reading changes using the SQL interface so that
users won't see repeated changes after a clean shutdown. Repeated changes still
occur when using the walsender interface or after an unclean shutdown.
---
 src/backend/replication/logical/logicalfuncs.c | 15 ++
 src/test/recovery/Makefile |  2 ++
 src/test/recovery/t/006_logical_decoding.pl| 40 ++
 3 files changed, 57 insertions(+)
 create mode 100644 src/test/recovery/t/006_logical_decoding.pl

diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index 4e4c8cd..9c7be2d 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -321,7 +321,22 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
 		 * business..)
 		 */
 		if (ctx->reader->EndRecPtr != InvalidXLogRecPtr && confirm)
+		{
 			LogicalConfirmReceivedLocation(ctx->reader->EndRecPtr);
+			/*
+			 * If only the confirmed_flush_lsn has changed the slot won't get
+			 * marked as dirty by the above. Callers on the walsender interface
+			 * are expected to keep track of their own progress and don't need
+			 * it written out. But SQL-interface users cannot specify their own
+			 * start positions and it's harder for them to keep track of their
+			 * progress, so we should make more of an effort to save it for them.
+			 *
+			 * Dirty the slot so it's written out at the next checkpoint. We'll
+			 * still lose its position on crash, as documented, but it's better
+			 * than always losing the position even on clean restart.
+			 */
+			ReplicationSlotMarkDirty();
+		}
 
 		/* free context, call shutdown callback */
 		FreeDecodingContext(ctx);
diff --git a/src/test/recovery/Makefile b/src/test/recovery/Makefile
index 9290719..a847952 100644
--- a/src/test/recovery/Makefile
+++ b/src/test/recovery/Makefile
@@ -18,3 +18,5 @@ check:
 
 clean distclean maintainer-clean:
 	rm -rf tmp_check
+
+EXTRA_INSTALL = contrib/test_decoding
diff --git a/src/test/recovery/t/006_logical_decoding.pl b/src/test/recovery/t/006_logical_decoding.pl
new file mode 100644
index 000..4a782b8
--- /dev/null
+++ b/src/test/recovery/t/006_logical_decoding.pl
@@ -0,0 +1,40 @@
+# Testing of logical decoding using SQL interface and/or pg_recvlogical
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 2;
+
+# Initialize master node
+my $node_master = get_new_node('master');
+$node_master->init(allows_streaming => 1);
+$node_master->append_conf(
+		'postgresql.conf', qq(
+max_replication_slots = 4
+wal_level = logical
+));
+$node_master->start;
+my $backup_name = 'master_backup';
+
+$node_master->safe_psql('postgres', qq[CREATE TABLE decoding_test(x integer, y text);]);
+
+$node_master->safe_psql('postgres', qq[SELECT pg_create_logical_replication_slot('test_slot', 'test_decoding');]);
+
+$node_master->safe_psql('postgres', qq[INSERT INTO decoding_test(x,y) SELECT s, s::text FROM generate_series(1,10) s;]);
+
+# Basic decoding works
+my($result) = $node_master->safe_psql('postgres', qq[SELECT pg_logical_slot_get_changes('test_slot', NULL, NULL);]);
+is(scalar(split /^/m, $result), 12, 'Decoding produced 12 rows inc BEGIN/COMMIT');
+
+# If we immediately crash the server we might lose the progress we just made
+# and replay the same changes again. But a clean shutdown should never repeat
+# the same changes when we use the SQL decoding interface.
+$node_master->restart('fast');
+
+# There are no new writes, so the result should be empty.
+$result = $node_master->safe_psql('postgres', qq[SELECT pg_logical_slot_get_changes('test_slot', NULL, NULL);]);
+chomp($result);

Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-09-04 Thread Craig Ringer
On 2 September 2016 at 17:49, Craig Ringer  wrote:

> So the main change becomes the one-liner in my prior mail.

Per feedback from Simon, updated with a new test in src/test/recovery .

If you revert the change to
src/backend/replication/logical/logicalfuncs.c then the test will
start failing.

I'd quite like to backpatch the fix since it's simple and safe, but I
don't feel that it's hugely important to do so. This is an annoyance,
not a serious data-risking bug.

My only concern here is that we still lose position on crash. So
SQL-interface callers should still be able to cope with it. But they
won't see it happen if they're only testing with normal shutdowns,
host reboots, etc. In practice users aren't likely to notice this
anyway, though, since most people don't restart the server all the
time. I think it's better than what we have.

This issue could be eliminated completely by calling
ReplicationSlotSave() after ReplicationSlotMarkDirty(). This would
force an immediate flush after a non-peeked SQL interface decoding
call. It means more fsync()ing, though, and the SQL interface can only
be used by polling so that could be a significant performance hit.
We'd want to only flush when the position changed, but even then it'll
mean a sync every time anything gets returned.

The better alternative is to add a variant on
pg_logical_slot_get_changes(...) etc that accepts a start LSN. But
it's not convenient or easy for SQL callers to extract the last commit
LSN received from the last call to pass it to the next one, so this
isn't simple, and is probably best tackled as part of the SQL
interface API change Petr and Andres discussed in this thread.

I'm inclined to just dirty the slot for now, then provide a better
solution by adding the peek/confirm interface discussed upthread down
the track.

Andres, Petr, got an opinion here?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-09-02 Thread Andres Freund
On 2016-09-02 10:50:17 +0200, Petr Jelinek wrote:
> Okay, that sounds reasonable; the SQL interface is already somewhat different
> from walsender as it does not really "stream", so it makes sense to improve the
> behavior there. As a side note, I would really like to have a cursor-like SQL
> interface which behaves more like the walsender one, but that's a very
> different patch.

FWIW, with the changes in 
https://www.postgresql.org/message-id/20160827214829.zo2dfb5jaikii...@alap3.anarazel.de
that's starting to be realistic for a function-based interface. By
switching things over to ValuePerCall mode you could actually use a
cursor, without pre-materializing all changes.

Greetings,

Andres Freund




Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-09-02 Thread Craig Ringer
On 2 September 2016 at 16:50, Petr Jelinek  wrote:

>> So here's a simpler patch that just dirties the slot when it's
>> replayed something from it on the SQL interface [...]
>
> Okay, that sounds reasonable; the SQL interface is already somewhat different
> from walsender as it does not really "stream", so it makes sense to improve the
> behavior there.

Yep. For the walsender interface it's reasonable to expect the
downstream to keep track of its progress - even if right now
pg_recvlogical doesn't do so very well. It's not practical for the SQL
interface.

> As a side note, I would really like to have a cursor-like SQL
> interface which behaves more like the walsender one, but that's a very
> different patch.

Me too. Specifically, I'd really like to be able to "peek, confirm,
peek, confirm" and also be able to specify my own start lsn for peek.
But as you say, that's a different patch.
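
The closest approximation today is peeking, then "confirming" by consuming up
to an LSN you know you've safely processed (sketch; slot name and LSN are
made up):

-- peek without consuming
SELECT lsn, xid, data
FROM pg_logical_slot_peek_changes('test_slot', NULL, NULL);

-- ... process and durably store the output, then consume up to the last LSN
-- you handled, discarding the repeated rows
SELECT count(*)
FROM pg_logical_slot_get_changes('test_slot', '0/16D1B98', NULL);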

>> The alternative is probably to add a second, "softer" dirty tracking
>> method that only causes a write at a clean shutdown or forced
>> checkpoint - and maybe doesn't bother fsync()ing. That's a bit more
>> invasive but would work for walsender use as well as the SQL
>> interface. I don't think it's worth the bother, since in the end
>> callers have to be prepared for repeated data on crash anyway.
>>
>
> Correct me if I am wrong but I think the only situation where it would
> matter is on a server that restarts often or crashes often (as the logical
> decoding then has to do the work many times) but I don't think it's worth
> optimizing for that kind of scenario.

Right. Though it's not so much doing the work many times that's the
concern - this change doesn't affect restart_lsn at all. It's that it
sends already-seen-and-confirmed changes to the output plugin and
therefore the client.

Not that big a deal. As far as I'm concerned, it's the job of users of
the walsender interface to keep track of the position they've consumed
up to and specify it to subsequent START_REPLICATION calls. You can't
do that on the SQL interface, which is why this change is needed
there.

Maybe the docs should be more explicit about the reason to specify
start lsn to START_REPLICATION not just 0/0 though, and also that we
will disregard it if it's less than the stored confirmed_flush_lsn.
Again, separate patch.

So the main change becomes the one-liner in my prior mail.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-09-02 Thread Petr Jelinek

On 02/09/16 09:58, Craig Ringer wrote:

On 1 September 2016 at 21:19, Simon Riggs  wrote:


I agree the doc patch should go in, though I suggest rewording it
slightly, like the attached patch.


Thanks. The rewording looks good to me and I think that
Doc-correction-each-change-once.v2.patch is ready for commit.

Meanwhile, thinking about the patch to dirty slots on
confirmed_flush_lsn advance some more, I don't think it's ideal to do
it in LogicalConfirmReceivedLocation(). I'd rather not add more
complexity there, it's already complex enough. Doing it there will
also lead to unnecessary slot write-out being done to slots at normal
checkpoints even if the slot hasn't otherwise changed, possibly in
response to lsn advances sent in response to keepalives due to
activity on other unrelated databases. Slots are always fsync()ed when
written out so we don't want to do it more than we have to.

We really only need to write out slots where only the
confirmed_flush_lsn has changed at a shutdown checkpoint since it's
not really a big worry if it goes backwards on crash, and otherwise it
can't. Even then it only _really_ matters when the SQL interface is
used. Losing the confirmed_flush_lsn is very annoying when using
pg_recvlogical too, and was the original motivation for this patch.
But I'm thinking of instead teaching pg_recvlogical to write out a
status file with its last confirmed point on exit and to be able to
take that as an argument when (re)connecting. Poor-man's replication
origins, effectively.

So here's a simpler patch that just dirties the slot when it's
replayed something from it on the SQL interface, so it's flushed out
next checkpoint or on shutdown. That's the main user visible defect
that should be fixed and it's trivial to do here. It means we'll still
forget the confirmed_flush_lsn on clean shutdown if it was advanced
using the walsender protocol, but *shrug*. That's just a little
inconvenient. I can patch pg_recvlogical separately.


Okay, that sounds reasonable; the SQL interface is already somewhat
different from walsender as it does not really "stream", so it makes sense
to improve the behavior there. As a side note, I would really like to have
a cursor-like SQL interface which behaves more like the walsender one, but
that's a very different patch.




The alternative is probably to add a second, "softer" dirty tracking
method that only causes a write at a clean shutdown or forced
checkpoint - and maybe doesn't bother fsync()ing. That's a bit more
invasive but would work for walsender use as well as the SQL
interface. I don't think it's worth the bother, since in the end
callers have to be prepared for repeated data on crash anyway.



Correct me if I am wrong but I think the only situation where it would 
matter is on a server that restarts often or crashes often (as the logical 
decoding then has to do the work many times) but I don't think it's 
worth optimizing for that kind of scenario.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-09-02 Thread Craig Ringer
On 1 September 2016 at 21:19, Simon Riggs  wrote:

> I agree the doc patch should go in, though I suggest rewording it
> slightly, like the attached patch.

Thanks. The rewording looks good to me and I think that
Doc-correction-each-change-once.v2.patch is ready for commit.

Meanwhile, thinking about the patch to dirty slots on
confirmed_flush_lsn advance some more, I don't think it's ideal to do
it in LogicalConfirmReceivedLocation(). I'd rather not add more
complexity there, it's already complex enough. Doing it there will
also lead to unnecessary slot write-out being done to slots at normal
checkpoints even if the slot hasn't otherwise changed, possibly in
response to lsn advances sent in response to keepalives due to
activity on other unrelated databases. Slots are always fsync()ed when
written out so we don't want to do it more than we have to.

We really only need to write out slots where only the
confirmed_flush_lsn has changed at a shutdown checkpoint since it's
not really a big worry if it goes backwards on crash, and otherwise it
can't. Even then it only _really_ matters when the SQL interface is
used. Losing the confirmed_flush_lsn is very annoying when using
pg_recvlogical too, and was the original motivation for this patch.
But I'm thinking of instead teaching pg_recvlogical to write out a
status file with its last confirmed point on exit and to be able to
take that as an argument when (re)connecting. Poor-man's replication
origins, effectively.

So here's a simpler patch that just dirties the slot when it's
replayed something from it on the SQL interface, so it's flushed out
next checkpoint or on shutdown. That's the main user visible defect
that should be fixed and it's trivial to do here. It means we'll still
forget the confirmed_flush_lsn on clean shutdown if it was advanced
using the walsender protocol, but *shrug*. That's just a little
inconvenient. I can patch pg_recvlogical separately.

The alternative is probably to add a second, "softer" dirty tracking
method that only causes a write at a clean shutdown or forced
checkpoint - and maybe doesn't bother fsync()ing. That's a bit more
invasive but would work for walsender use as well as the SQL
interface. I don't think it's worth the bother, since in the end
callers have to be prepared for repeated data on crash anyway.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 7261d4e182a011c8bb609f9307dcc61c5e15dd56 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Mon, 2 May 2016 19:50:22 +0800
Subject: [PATCH] Dirty replication slots when accessed via sql interface

When pg_logical_slot_get_changes(...) sets confirmed_flush_lsn to the point at
which replay stopped, it doesn't dirty the replication slot.  So if the replay
didn't cause restart_lsn or catalog_xmin to change as well, this change will
not get written out to disk, even on a clean shutdown.

If Pg crashes or restarts, a subsequent pg_logical_slot_get_changes(...) call
will see the same changes already replayed since it uses the slot's
confirmed_flush_lsn as the start point for fetching changes. The caller can't
specify a start LSN when using the SQL interface.

Mark the slot as dirty after reading changes using the SQL interface so that
users won't see repeated changes after a clean shutdown. Repeated changes still
occur when using the walsender interface or after an unclean shutdown.
---
 src/backend/replication/logical/logicalfuncs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/backend/replication/logical/logicalfuncs.c b/src/backend/replication/logical/logicalfuncs.c
index 4e4c8cd..3d1de03 100644
--- a/src/backend/replication/logical/logicalfuncs.c
+++ b/src/backend/replication/logical/logicalfuncs.c
@@ -321,7 +321,10 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
 		 * business..)
 		 */
 		if (ctx->reader->EndRecPtr != InvalidXLogRecPtr && confirm)
+		{
 			LogicalConfirmReceivedLocation(ctx->reader->EndRecPtr);
+			ReplicationSlotMarkDirty();
+		}
 
 		/* free context, call shutdown callback */
 		FreeDecodingContext(ctx);
-- 
2.5.5




Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-09-01 Thread Simon Riggs
On 20 August 2016 at 15:04, Petr Jelinek  wrote:
> On 20/08/16 16:01, Craig Ringer wrote:
>>
>> On 5 June 2016 at 09:54, David G. Johnston wrote:
>>
>> On Thursday, March 17, 2016, Craig Ringer wrote:
>>
>> The first patch was incorrectly created on top of failover slots
>> not HEAD. Attached patch applies on HEAD.
>>
>>
>> Lots of logical decoding work ongoing but this one shows as active
>> in the September cf and the comments by Craig indicate potential
>> relevance to 9.6.  Is this still live as-is or has it been subsumed
>> by other threads?
>>
>>
>>
>> Yes. Both those patches are still pending and should be considered for
>> commit in the next CF. (Of course, I think they should just be
>> committed, but I would, wouldn't I?)
>>
>>
>
> I think the doc one should definitely go in and possibly be back-patched all
> the way to 9.4. I didn't look at the other one.

I agree the doc patch should go in, though I suggest rewording it
slightly, like the attached patch.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Doc-correction-each-change-once.v2.patch
Description: Binary data



Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-08-20 Thread Petr Jelinek

On 20/08/16 16:01, Craig Ringer wrote:

On 5 June 2016 at 09:54, David G. Johnston wrote:

On Thursday, March 17, 2016, Craig Ringer wrote:

The first patch was incorrectly created on top of failover slots
not HEAD. Attached patch applies on HEAD.


Lots of logical decoding work ongoing but this one shows as active
in the September cf and the comments by Craig indicate potential
relevance to 9.6.  Is this still live as-is or has it been subsumed
by other threads?



Yes. Both those patches are still pending and should be considered for
commit in the next CF. (Of course, I think they should just be
committed, but I would, wouldn't I?)




I think the doc one should definitely go in and possibly be back-patched 
all the way to 9.4. I didn't look at the other one.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-08-20 Thread Craig Ringer
On 5 June 2016 at 09:54, David G. Johnston 
wrote:

> On Thursday, March 17, 2016, Craig Ringer  wrote:
>
>> The first patch was incorrectly created on top of failover slots not
>> HEAD. Attached patch applies on HEAD.
>>
>
> Lots of logical decoding work ongoing but this one shows as active in the
> September cf and the comments by Craig indicate potential relevance to
> 9.6.  Is this still live as-is or has it been subsumed by other threads?
>


Yes. Both those patches are still pending and should be considered for
commit in the next CF. (Of course, I think they should just be committed,
but I would, wouldn't I?)


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-06-04 Thread David G. Johnston
On Thursday, March 17, 2016, Craig Ringer  wrote:

> The first patch was incorrectly created on top of failover slots not HEAD.
> Attached patch applies on HEAD.
>

Lots of logical decoding work ongoing but this one shows as active in the
September cf and the comments by Craig indicate potential relevance to
9.6.  Is this still live as-is or has it been subsumed by other threads?

David J.


Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-03-18 Thread Craig Ringer
The first patch was incorrectly created on top of failover slots not HEAD.
Attached patch applies on HEAD.
From 87d839f8a2e78abb17fa985502fd5b66f0872b57 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Wed, 16 Mar 2016 15:45:16 +0800
Subject: [PATCH 1/2] Correct incorrect claim that slots output a change
 "exactly once"

Replication slots may actually emit a change more than once
if the master crashes before flushing the slot.

See
http://www.postgresql.org/message-id/camsr+ygsatrgqpcx9qx4eocizwsa27xjkeipsotaje8ofix...@mail.gmail.com
for details.
---
 doc/src/sgml/logicaldecoding.sgml | 16 
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/logicaldecoding.sgml b/doc/src/sgml/logicaldecoding.sgml
index e841348..78e3dba 100644
--- a/doc/src/sgml/logicaldecoding.sgml
+++ b/doc/src/sgml/logicaldecoding.sgml
@@ -12,7 +12,6 @@
 
   
Changes are sent out in streams identified by logical replication slots.
-   Each stream outputs each change exactly once.
   
 
   
@@ -204,8 +203,7 @@ $ pg_recvlogical -d postgres --slot test --drop-slot
  In the context of logical replication, a slot represents a stream of
  changes that can be replayed to a client in the order they were made on
  the origin server. Each slot streams a sequence of changes from a single
- database, sending each change exactly once (except when peeking forward
- in the stream).
+ database.
 
 
 
@@ -218,7 +216,17 @@ $ pg_recvlogical -d postgres --slot test --drop-slot
 
  A replication slot has an identifier that is unique across all databases
  in a PostgreSQL cluster. Slots persist
- independently of the connection using them and are crash-safe.
+ independently of the connection using them. Slot creation and drop is
+ crash-safe, and slots will never be corrupted by a crash.
+
+
+
+ A logical slot outputs each database change at least once. A slot will
+ usually only emit a change once, but recently-sent changes may be sent
+ again if the server crashes and restarts. Clients should remember
+ the last LSN they saw when decoding and skip over any repeated data or
+ (when using the replication protocol) request that decoding start from
+ that LSN rather than letting the server determine the start point.
 
 
 
-- 
2.1.0

From fdfe91482d7dd28920db67067d77388ef3871165 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Wed, 16 Mar 2016 15:12:34 +0800
Subject: [PATCH 2/2] Dirty replication slots when confirm_lsn is changed

---
 src/backend/replication/logical/logical.c | 62 +--
 1 file changed, 42 insertions(+), 20 deletions(-)

diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index 2e6d3f9..40db6ff 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -437,6 +437,7 @@ DecodingContextFindStartpoint(LogicalDecodingContext *ctx)
 	}
 
 	ctx->slot->data.confirmed_flush = ctx->reader->EndRecPtr;
+	ReplicationSlotMarkDirty();
 }
 
 /*
@@ -847,10 +848,15 @@ LogicalConfirmReceivedLocation(XLogRecPtr lsn)
 	{
 		bool		updated_xmin = false;
 		bool		updated_restart = false;
+		bool		updated_confirm = false;
 
 		SpinLockAcquire(&MyReplicationSlot->mutex);
 
-		MyReplicationSlot->data.confirmed_flush = lsn;
+		if (MyReplicationSlot->data.confirmed_flush != lsn)
+		{
+			MyReplicationSlot->data.confirmed_flush = lsn;
+			updated_confirm = true;
+		}
 
 		/* if were past the location required for bumping xmin, do so */
 		if (MyReplicationSlot->candidate_xmin_lsn != InvalidXLogRecPtr &&
@@ -888,34 +894,50 @@ LogicalConfirmReceivedLocation(XLogRecPtr lsn)
 
 		SpinLockRelease(&MyReplicationSlot->mutex);
 
-		/* first write new xmin to disk, so we know whats up after a crash */
-		if (updated_xmin || updated_restart)
+		if (updated_xmin || updated_restart || updated_confirm)
 		{
 			ReplicationSlotMarkDirty();
-			ReplicationSlotSave();
-			elog(DEBUG1, "updated xmin: %u restart: %u", updated_xmin, updated_restart);
-		}
 
-		/*
-		 * Now the new xmin is safely on disk, we can let the global value
-		 * advance. We do not take ProcArrayLock or similar since we only
-		 * advance xmin here and there's not much harm done by a concurrent
-		 * computation missing that.
-		 */
-		if (updated_xmin)
-		{
-			SpinLockAcquire(&MyReplicationSlot->mutex);
-			MyReplicationSlot->effective_catalog_xmin = MyReplicationSlot->data.catalog_xmin;
-			SpinLockRelease(&MyReplicationSlot->mutex);
+			/*
+			 * first write new xmin to disk, so we know whats up
+			 * after a crash.
+			 */
+			if (updated_xmin || updated_restart)
+			{
+				ReplicationSlotSave();
+				elog(DEBUG1, "updated xmin: %u restart: %u", updated_xmin, updated_restart);
+			}
 
-			ReplicationSlotsComputeRequiredXmin(false);
-			ReplicationSlotsComputeRequiredLSN();
+			/*
+			 * Now the new xmin is safely on disk, we can let the global value
+			 * advance. We do 

Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-03-16 Thread Craig Ringer
Attached is a patch to mark a logical replication slot as dirty if its
confirmed lsn is changed.

Aesthetically I'm not sure if it's better to do it per this patch and call
ReplicationSlotMarkDirty after we release the spinlock, add a new
ReplicationSlotMarkDirtyLocked() that skips the spinlock acquisition, or
add a bool is_locked arg to ReplicationSlotMarkDirty and update all call
sites. All will have the same result.

I've confirmed this works as expected as part of the failover slots test
suite but it'd be pretty seriously cumbersome to extract. If anyone feels
strongly about it I can add a test that shows that

- start master
- create slot
- do some stuff
- replay from slot
- fast-stop master
- start master
- replay from slot

doesn't see the same data a second time, but if we repeat it using
immediate stop it will see the same data when replaying again.
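
As a psql-level sketch (object names made up; the restarts happen outside SQL
and are shown as comments, and this assumes the dirty-slot patch is applied):

SELECT pg_create_logical_replication_slot('test_slot', 'test_decoding');
CREATE TABLE decoding_test(x integer);
INSERT INTO decoding_test VALUES (1), (2);
SELECT data FROM pg_logical_slot_get_changes('test_slot', NULL, NULL);
-- pg_ctl -m fast restart: clean shutdown, the position is retained
SELECT data FROM pg_logical_slot_get_changes('test_slot', NULL, NULL);  -- empty
INSERT INTO decoding_test VALUES (3);
SELECT data FROM pg_logical_slot_get_changes('test_slot', NULL, NULL);
-- pg_ctl -m immediate restart: crash-like stop, the on-disk position may be
-- stale, so the next call may replay the same change again
SELECT data FROM pg_logical_slot_get_changes('test_slot', NULL, NULL);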

Also attached is another patch to amend the docs to reflect the fact that a
slot can actually replay the same change more than once. I avoided the
strong temptation to update other parts of the docs nearby.

Both these are fixes that should IMO be applied to 9.6.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From e2bda9fa26f9d92059490d2b9ea7c37ae45f81b1 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Wed, 16 Mar 2016 15:12:34 +0800
Subject: [PATCH] Dirty replication slots when confirm_lsn is changed

---
 src/backend/replication/logical/logical.c | 62 +--
 1 file changed, 42 insertions(+), 20 deletions(-)

diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index 2c7b749..d3fb1a5 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -440,6 +440,7 @@ DecodingContextFindStartpoint(LogicalDecodingContext *ctx)
 	}
 
 	ctx->slot->data.confirmed_flush = ctx->reader->EndRecPtr;
+	ReplicationSlotMarkDirty();
 }
 
 /*
@@ -850,10 +851,15 @@ LogicalConfirmReceivedLocation(XLogRecPtr lsn)
 	{
 		bool		updated_xmin = false;
 		bool		updated_restart = false;
+		bool		updated_confirm = false;
 
 		SpinLockAcquire(&MyReplicationSlot->mutex);
 
-		MyReplicationSlot->data.confirmed_flush = lsn;
+		if (MyReplicationSlot->data.confirmed_flush != lsn)
+		{
+			MyReplicationSlot->data.confirmed_flush = lsn;
+			updated_confirm = true;
+		}
 
 		/* if were past the location required for bumping xmin, do so */
 		if (MyReplicationSlot->candidate_xmin_lsn != InvalidXLogRecPtr &&
@@ -891,34 +897,50 @@ LogicalConfirmReceivedLocation(XLogRecPtr lsn)
 
 		SpinLockRelease(&MyReplicationSlot->mutex);
 
-		/* first write new xmin to disk, so we know whats up after a crash */
-		if (updated_xmin || updated_restart)
+		if (updated_xmin || updated_restart || updated_confirm)
 		{
 			ReplicationSlotMarkDirty();
-			ReplicationSlotSave();
-			elog(DEBUG1, "updated xmin: %u restart: %u", updated_xmin, updated_restart);
-		}
 
-		/*
-		 * Now the new xmin is safely on disk, we can let the global value
-		 * advance. We do not take ProcArrayLock or similar since we only
-		 * advance xmin here and there's not much harm done by a concurrent
-		 * computation missing that.
-		 */
-		if (updated_xmin)
-		{
-			SpinLockAcquire(&MyReplicationSlot->mutex);
-			MyReplicationSlot->effective_catalog_xmin = MyReplicationSlot->data.catalog_xmin;
-			SpinLockRelease(&MyReplicationSlot->mutex);
+			/*
+			 * first write new xmin to disk, so we know whats up
+			 * after a crash.
+			 */
+			if (updated_xmin || updated_restart)
+			{
+				ReplicationSlotSave();
+				elog(DEBUG1, "updated xmin: %u restart: %u", updated_xmin, updated_restart);
+			}
 
-			ReplicationSlotsUpdateRequiredXmin(false);
-			ReplicationSlotsUpdateRequiredLSN();
+			/*
+			 * Now the new xmin is safely on disk, we can let the global value
+			 * advance. We do not take ProcArrayLock or similar since we only
+			 * advance xmin here and there's not much harm done by a concurrent
+			 * computation missing that.
+			 */
+			if (updated_xmin)
+			{
+				SpinLockAcquire(&MyReplicationSlot->mutex);
+				MyReplicationSlot->effective_catalog_xmin = MyReplicationSlot->data.catalog_xmin;
+				SpinLockRelease(&MyReplicationSlot->mutex);
+
+				ReplicationSlotsUpdateRequiredXmin(false);
+				ReplicationSlotsUpdateRequiredLSN();
+			}
 		}
 	}
 	else
 	{
+		bool dirtied = false;
+
 		SpinLockAcquire(&MyReplicationSlot->mutex);
-		MyReplicationSlot->data.confirmed_flush = lsn;
+		if (MyReplicationSlot->data.confirmed_flush != lsn)
+		{
+			MyReplicationSlot->data.confirmed_flush = lsn;
+			dirtied = true;
+		}
 		SpinLockRelease(&MyReplicationSlot->mutex);
+
+		if (dirtied)
+			ReplicationSlotMarkDirty();
 	}
 }
-- 
2.1.0

From 1f8d9319ef9c934c414cebf1f5223c3b3023bf7f Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Wed, 16 Mar 2016 15:45:16 +0800
Subject: [PATCH 2/2] Correct incorrect claim that slots output a change
 "exactly once"

Replication slots may actually emit a change more than once
if the master crashes before flushing the 

Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-03-14 Thread Petr Jelinek

On 14/03/16 10:48, Craig Ringer wrote:


You btw can emulate asking for the specific LSN in SQL interface by
first calling the pg_logical_slot_get_changes function with upto_lsn
set to whatever lsn you expect to start at, but it's ugly.


Ugh.

I didn't realise pg_logical_slot_get_changes could backtrack by setting
an upto_lsn in the past. That doesn't seem like something we should
really be doing - if it's a limit, and we're already past that limit, we
should just be returning the empty set.



Not into the past, only into the future relative to the old confirmed_lsn, at
least. The point was that the next call will start from whatever LSN you
specified as upto_lsn.
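
Concretely, the (ugly) emulation looks like this, with a made-up slot name
and LSN:

-- consume and discard everything up to the LSN you want to start from
SELECT count(*)
FROM pg_logical_slot_get_changes('test_slot', '0/16D1B98', NULL);
-- the next call then begins at (roughly) that point
SELECT lsn, xid, data
FROM pg_logical_slot_get_changes('test_slot', NULL, NULL);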


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-03-14 Thread Craig Ringer
On 14 March 2016 at 17:16, Petr Jelinek  wrote:


> It will not change the fact that the slot can go backwards


Sure. I don't consider that a problem in general though. It's similar to
the way we lose cached sequence chunks on crash - a performance
optimisation with a user visible effect. It needs to be documented
correctly, but isn't IMO a problem.

Failing to save the full slot state on clean shutdown is though, IMO. It
wouldn't matter if it was only used for sync rep, but since it's also used
to manage the resume point for SQL-level logical decoding calls we should
make a reasonable effort to preserve it ... and allow the user to handle
the situation correctly when we fail to preserve it.


> You btw can emulate asking for the specific LSN in SQL interface by first
> calling the pg_logical_slot_get_changes function with upto_lsn set to
> whatever lsn you expect to start at, but it's ugly.


Ugh.

I didn't realise pg_logical_slot_get_changes could backtrack by setting an
upto_lsn in the past. That doesn't seem like something we should really be
doing - if it's a limit, and we're already past that limit, we should just
be returning the empty set.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-03-14 Thread Petr Jelinek

On 14/03/16 08:08, Craig Ringer wrote:

On 11 March 2016 at 20:15, Alvaro Herrera wrote:

Craig Ringer wrote:
> Hi all
>
> I think I found a couple of logical decoding issues while writing tests 
for
> failover slots.
>
> Despite the docs' claim that a logical slot will replay data "exactly
> once", a slot's confirmed_lsn can go backwards and the SQL functions can
> replay the same data more than once. We don't mark a slot as dirty if only
> its confirmed_lsn is advanced, so it isn't flushed to disk. For failover
> slots this means it also doesn't get replicated via WAL. After a master
> crash, or for failover slots after a promote event, the confirmed_lsn will
> go backwards.  Users of the SQL interface must keep track of the safely
> locally flushed slot position themselves and throw the repeated data away.
> Unlike with the walsender protocol it has no way to ask the server to skip
> that data.
>
> Worse, because we don't dirty the slot even a *clean shutdown* causes slot
> confirmed_lsn to go backwards. That's a bug IMO. We should force a flush 
of
> all slots at the shutdown checkpoint, whether dirty or not, to address it.

Why don't we mark the slot dirty when confirmed_lsn advances?  If we fix
that, doesn't it fix the other problems too?


Yes, it does.



It will not change the fact that the slot can go backwards even on a clean 
shutdown of the server, however: in walsender the confirmed_lsn only 
changes after a feedback message, so if the client crashes it won't get 
updated (for obvious reasons).


You btw can emulate asking for the specific LSN in SQL interface by 
first calling the pg_logical_slot_get_changes function with upto_lsn set 
to whatever lsn you expect to start at, but it's ugly.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-03-14 Thread Craig Ringer
On 11 March 2016 at 20:15, Alvaro Herrera  wrote:

> Craig Ringer wrote:
> > Hi all
> >
> > I think I found a couple of logical decoding issues while writing tests
> for
> > failover slots.
> >
> > Despite the docs' claim that a logical slot will replay data "exactly
> > once", a slot's confirmed_lsn can go backwards and the SQL functions can
> > replay the same data more than once. We don't mark a slot as dirty if only
> > its confirmed_lsn is advanced, so it isn't flushed to disk. For failover
> > slots this means it also doesn't get replicated via WAL. After a master
> > crash, or for failover slots after a promote event, the confirmed_lsn
> will
> > go backwards.  Users of the SQL interface must keep track of the safely
> > locally flushed slot position themselves and throw the repeated data
> away.
> > Unlike with the walsender protocol it has no way to ask the server to
> skip
> > that data.
> >
> > Worse, because we don't dirty the slot even a *clean shutdown* causes
> slot
> > confirmed_lsn to go backwards. That's a bug IMO. We should force a flush
> of
> > all slots at the shutdown checkpoint, whether dirty or not, to address
> it.
>
> Why don't we mark the slot dirty when confirmed_lsn advances?  If we fix
> that, doesn't it fix the other problems too?
>

Yes, it does.

That'll cause slots to be written out at checkpoints when they otherwise
wouldn't have to be, but I'd rather be doing a little more work in this
case. Compared to the disk activity from WAL decoding etc the effect should
be undetectable anyway.

Andres? Any objection to dirtying a slot when the confirmed lsn advances,
so we write it out at the next checkpoint?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-03-11 Thread Alvaro Herrera
Craig Ringer wrote:
> Hi all
> 
> I think I found a couple of logical decoding issues while writing tests for
> failover slots.
> 
> Despite the docs' claim that a logical slot will replay data "exactly
> once", a slot's confirmed_lsn can go backwards and the SQL functions can
> replay the same data more than once. We don't mark a slot as dirty if only
> its confirmed_lsn is advanced, so it isn't flushed to disk. For failover
> slots this means it also doesn't get replicated via WAL. After a master
> crash, or for failover slots after a promote event, the confirmed_lsn will
> go backwards.  Users of the SQL interface must keep track of the safely
> locally flushed slot position themselves and throw the repeated data away.
> Unlike with the walsender protocol it has no way to ask the server to skip
> that data.
> 
> Worse, because we don't dirty the slot even a *clean shutdown* causes slot
> confirmed_lsn to go backwards. That's a bug IMO. We should force a flush of
> all slots at the shutdown checkpoint, whether dirty or not, to address it.

Why don't we mark the slot dirty when confirmed_lsn advances?  If we fix
that, doesn't it fix the other problems too?

-- 
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-03-11 Thread Craig Ringer
On 11 March 2016 at 17:00, Simon Riggs  wrote:


>
>
>> Also, pg_logical_slot_get_changes and its _peek_ variant should have a
>> param specifying the starting LSN to read and return. If this is lower than
>> the restart_lsn but non-null it should ERROR; if it's greater than or equal
>> it should use this position instead of starting at the confirmed_lsn.
>>
>
> Maybe. I don't really like changing APIs. Can we add new funcs? Make sure
> a NULL LSN can be passed as well.
>

It'd be compatible with the old SQL API and we've already made one such
change in 9.5, to add the parameter to immediately reserve WAL for physical
slots. You just CREATE OR REPLACE FUNCTION it in system_views.sql with a
DEFAULT so the old signature still works.
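
A simplified sketch of what that could look like for the hypothetical
start_lsn variant discussed here (not the exact catalog definition; the
pg_proc entry and the C function would need matching changes, this only
shows how the DEFAULT keeps old-style calls working):

CREATE OR REPLACE FUNCTION pg_logical_slot_get_changes(
    IN slot_name name,
    IN upto_lsn pg_lsn,
    IN upto_nchanges int,
    IN start_lsn pg_lsn DEFAULT NULL,      -- hypothetical new argument
    VARIADIC options text[] DEFAULT '{}',
    OUT lsn pg_lsn, OUT xid xid, OUT data text)
RETURNS SETOF RECORD
LANGUAGE internal VOLATILE
AS 'pg_logical_slot_get_changes';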

That works for C-level callers too, unless they try to get clever and
DirectFunctionCall3 it. Even then it might, I don't recall off the top of
my head, but if not we can defend against that too by testing PG_NARGS in
the function itself. This wasn't done for the reserve_wal change and I'm
not convinced it's needed even if the fmgr doesn't take care of the default
parameters - it's not like that's a function that makes much sense to
DirectFunctionCall anyway.

> Is the return type of pg_logical_slot_peek_changes() incorrect in the docs?

I don't think so, why?



>> Time permitting I'd like to add a pg_logical_slot_confirm function, so you
>> can alternate _peek_changes and _confirm, making it possible to get
>> walsender-interface-like behaviour via the SQL interface.
>>
>
> I thought that's what replorigins do.
>

Replication origins provide an easy and efficient way for a PostgreSQL
downstream (client) to track its replay state and position in a reliable
way. If the client is PostgreSQL then you can use replication origins to
track the last flushed LSN, yes. If you're using the walsender interface
you must do so. They are not used by the server-side of logical decoding on
either walsender or SQL interfaces.

If you're using the SQL interface it has its own replay position tracking.
Just before pg_logical_slot_get_changes returns it updates the confirm
position of the slot, as if you'd sent a replay confirmation from the
client on the walsender protocol. The problem is that this happens before
we know the client has received and flushed all the data we just sent it.
The client might not get all (or any) of it if there are network issues,
for example.

Because the replay position can go backwards the client can and must track
the last-safely-replayed position itself, as established earlier, so it
can skip over data it already saw. The problem is that when using the SQL
interface the client can't do the reverse - it can't tell the server "send
me that data again, please, I only got half of it last time because the
network died". The server has advanced the confirmed position and there's
no way to ask for the data again over the SQL interface.

That's why I want to add a param to let it do so, replaying from prior to
the confirm location or skipping past data that'd otherwise be replayed. If
you pass NULL (the default, and the only option in 9.5 or older) for that
param then you get data from the server's last recorded confirm location,
which is the server's best-effort tracking of your replay position from
server-side.

That's also why a pg_logical_slot_confirm function could be handy. It lets
you do a _peek_changes then a _confirm when you know you've flushed them,
prior to the next peek.
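
Usage would look something like this (pg_logical_slot_confirm is the
proposed, not-yet-existing function; the LSN is made up):

SELECT lsn, xid, data
FROM pg_logical_slot_peek_changes('test_slot', NULL, NULL);
-- client durably records what it has processed, then:
SELECT pg_logical_slot_confirm('test_slot', '0/16D1B98');
-- subsequent peeks resume after the confirmed point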

I think the docs for pg_replication_slots are wrong about the confirm
location, btw: they say data from before the confirm location can't be
replayed, but it can over the walsender protocol. There's just no way to
ask for it over the SQL interface.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-03-11 Thread Simon Riggs
On 11 March 2016 at 08:19, Craig Ringer  wrote:

> Hi all
>
> I think I found a couple of logical decoding issues while writing tests
> for failover slots.
>
> Despite the docs' claim that a logical slot will replay data "exactly
> once", a slot's confirmed_lsn can go backwards and the SQL functions can
> replay the same data more than once. We don't mark a slot as dirty if only
> its confirmed_lsn is advanced, so it isn't flushed to disk. For failover
> slots this means it also doesn't get replicated via WAL. After a master
> crash, or for failover slots after a promote event, the confirmed_lsn will
> go backwards.  Users of the SQL interface must keep track of the safely
> locally flushed slot position themselves and throw the repeated data away.
> Unlike with the walsender protocol it has no way to ask the server to skip
> that data.
>

Yeh, I read that and thought it was wrong. "At least once, in order" is
what we want, how it works and how it should be described.


> Worse, because we don't dirty the slot even a *clean shutdown* causes slot
> confirmed_lsn to go backwards. That's a bug IMO. We should force a flush of
> all slots at the shutdown checkpoint, whether dirty or not, to address it.
>

Given the earlier understanding, above, I'm not sure I see why that must
happen. But I agree it should happen.


> Barring objections I intend to submit a fix to:
>
> - Document that slots can replay data more than once
> - Force flush of all slots on a shutdown checkpoint
>

+1


> Also, pg_logical_slot_get_changes and its _peek_ variant should have a
> param specifying the starting LSN to read and return. If this is lower than
> the restart_lsn but non-null it should ERROR; if it's greater than or equal
> it should use this position instead of starting at the confirmed_lsn.
>

Maybe. I don't really like changing APIs. Can we add new funcs? Make sure a
NULL LSN can be passed as well.

Is the return type of pg_logical_slot_peek_changes() incorrect in the docs?


> Time permitting I'd like to add a pg_logical_slot_confirm function, so you
> can alternate _peek_changes and _confirm, making it possible to get
> walsender-interface-like behaviour via the SQL interface.
>

I thought that's what replorigins do.


> Right now you can't reliably use the SQL interface because _get_changes
> can succeed locally and advance the slot but the client can fail to receive
> all the changes due to network issues etc. Sure, the SQL interface is meant
> mainly for testing, but especially for !postgresql downstreams I strongly
> suspect people will want to use it for "real" work rather than have to
> modify each client driver to support replication protocol extensions.
>

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services