Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-06-26 Thread Alvaro Herrera
Pushed this to 9.4 - master after some more tinkering.

It occurred to me that it might be better to have
ReorderBufferSetBaseSnapshot do the IncrRefCount instead of expecting
caller to do it.  But I wouldn't backpatch that change, so I refrained.

Thanks for the patch.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-06-26 Thread Arseny Sher


Alvaro Herrera  writes:

> I'm struggling with this assert.  I find that test_decoding passes tests
> with this assertion in branch master, but fails in 9.4 - 10.  Maybe I'm
> running the tests wrong (no assertions in master?) but I don't see it.
> It *should* fail ...

Your v3 patch fails for me on freshest master (4d54543efa) in exactly
this assert, it looks like there is something wrong in your setup.

--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-06-26 Thread Alvaro Herrera
On 2018-Jun-26, Arseny Sher wrote:

> Alvaro Herrera  writes:

> > * I'm a bit unsure about the comment atop ReorderBufferSetBaseSnapshot.
> >   Obviously, the bit within the #if 0/#endif I'm going to remove before
> >   push.
> 
> It looks like you've started editing that bit and didn't finish.

Yeah, I left those lines in place as a reminder that I need to finish
editing, wondering if there's any nuance I'm missing that I need to add
to the final version.

> >   I don't understand why it says "Needs to be called before any
> >   changes are added with ReorderBufferQueueChange"; but if you edit that
> >   function and add an assert that the base snapshot is set, it crashes
> >   pretty quickly in the test_decoding tests.  (The supposedly bogus
> >   comment was there before your patch -- I'm not saying your comment
> >   addition is faulty.)
> 
> That's because REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID changes are
> queued whenever we have read that xact has modified the catalog,
> regardless of base snapshot existence. Even if there are no changes yet,
> we will need it for correct visibility of catalog, so I am inclined to
> remove the assert and comment or rephrase the latter with 'any
> *data-carrying* changes'.

I'm struggling with this assert.  I find that test_decoding passes tests
with this assertion in branch master, but fails in 9.4 - 10.  Maybe I'm
running the tests wrong (no assertions in master?) but I don't see it.
It *should* fail ...

> > * I also noticed that we're doing subtxn cleanup one by one in both
> >   ReorderBufferAssignChild and ReorderBufferCommitChild, which means the
> >   top-level txn is sought in the hash table over and over, which seems a
> >   bit silly.  Not this patch's problem to fix ...
> 
> There is already one-entry cache in ReorderBufferTXNByXid.

That's useless, because we use two xids: first the parent; then the
subxact.  Looking up the subxact evicts the parent from the one-entry
cache, so when we loop to process next subxact, the parent is no longer
cached :-)

> We could add 'don't cache me' flag to it and use it with subxacts, or
> simply pull top-level reorderbuffer out of loops.

Yeah, but that's an API change.

> I'm fine with the rest of your edits. One more little comment:
> 
> @@ -543,9 +546,10 @@ ReorderBufferQueueChange(ReorderBuffer *rb, 
> TransactionId xid, XLogRecPtr lsn,
>   ReorderBufferTXN *txn;
> 
>   txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
> + Assert(txn->base_snapshot != NULL || txn->is_known_as_subxact);
> 
>   change->lsn = lsn;
> - Assert(InvalidXLogRecPtr != lsn);
> + Assert(!XLogRecPtrIsInvalid(lsn));
>   dlist_push_tail(>changes, >node);
>   txn->nentries++;
>   txn->nentries_mem++;
> 
> Since we do that, probably we should replace all lsn validity checks
> with XLogRectPtrIsInvalid?

I reverted this back to how it was originally.  We can change it as a
style item later in all places where it appears (branch master only),
but modifying only one in a backpatched commit seems poor practice.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-06-26 Thread Arseny Sher


Alvaro Herrera  writes:

> Firstly -- this is top-notch detective work, kudos and thanks for the
> patch and test cases.  (I verified that both fail before the code fix.)

Thank you!

> Here's a v3.  I applied a lot of makeup in order to try to understand
> what's going on.  I *think* I have a grasp on the original code and your
> bugfix, not terribly firm I admit.

Well, when I started digging it, I have found logical decoding code
somewhat underdocumented. If you or any other committer will consider
the patch trying to improve the comments, I can prepare one.

> * you don't need to Assert that things are not NULL if you're
>   immediately going to dereference them.

Indeed.

> * I think setting snapshot_base to NULL in ReorderBufferCleanupTXN is
>   pointless, since the struct is gonna be freed shortly afterwards.

Yeah, but it is a general style of the original code, e.g. see
/* cosmetic... */
comments. It probably makes the picture a bit more aesthetic and
consistent, kind of final chord, though I don't mind removing it.

> * I rewrote many comments (both existing and some of the ones your patch
>   adds), and added lots of comments where there were none.

Now the code is nicer, thanks.

> * I'm a bit unsure about the comment atop ReorderBufferSetBaseSnapshot.
>   Obviously, the bit within the #if 0/#endif I'm going to remove before
>   push.

It looks like you've started editing that bit and didn't finish.

>   I don't understand why it says "Needs to be called before any
>   changes are added with ReorderBufferQueueChange"; but if you edit that
>   function and add an assert that the base snapshot is set, it crashes
>   pretty quickly in the test_decoding tests.  (The supposedly bogus
>   comment was there before your patch -- I'm not saying your comment
>   addition is faulty.)

That's because REORDER_BUFFER_CHANGE_INTERNAL_COMMAND_ID changes are
queued whenever we have read that xact has modified the catalog,
regardless of base snapshot existence. Even if there are no changes yet,
we will need it for correct visibility of catalog, so I am inclined to
remove the assert and comment or rephrase the latter with 'any
*data-carrying* changes'.

> * I also noticed that we're doing subtxn cleanup one by one in both
>   ReorderBufferAssignChild and ReorderBufferCommitChild, which means the
>   top-level txn is sought in the hash table over and over, which seems a
>   bit silly.  Not this patch's problem to fix ...

There is already one-entry cache in ReorderBufferTXNByXid. We could add
'don't cache me' flag to it and use it with subxacts, or simply pull
top-level reorderbuffer out of loops.

I'm fine with the rest of your edits. One more little comment:

@@ -543,9 +546,10 @@ ReorderBufferQueueChange(ReorderBuffer *rb, TransactionId 
xid, XLogRecPtr lsn,
ReorderBufferTXN *txn;

txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
+   Assert(txn->base_snapshot != NULL || txn->is_known_as_subxact);

change->lsn = lsn;
-   Assert(InvalidXLogRecPtr != lsn);
+   Assert(!XLogRecPtrIsInvalid(lsn));
dlist_push_tail(>changes, >node);
txn->nentries++;
txn->nentries_mem++;

Since we do that, probably we should replace all lsn validity checks
with XLogRectPtrIsInvalid?

--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-06-25 Thread Alvaro Herrera
Hello

Firstly -- this is top-notch detective work, kudos and thanks for the
patch and test cases.  (I verified that both fail before the code fix.)

Here's a v3.  I applied a lot of makeup in order to try to understand
what's going on.  I *think* I have a grasp on the original code and your
bugfix, not terribly firm I admit.

Some comments

* you don't need to Assert that things are not NULL if you're
  immediately going to dereference them.  The assert is there to make
  the code crash in case it's a NULL pointer, but the subsequent
  dereference is going to have the same effect, so the assert is
  redundant.

* I think setting snapshot_base to NULL in ReorderBufferCleanupTXN is
  pointless, since the struct is gonna be freed shortly afterwards.

* I rewrote many comments (both existing and some of the ones your patch
  adds), and added lots of comments where there were none.

* I'm a bit unsure about the comment atop ReorderBufferSetBaseSnapshot.
  Obviously, the bit within the #if 0/#endif I'm going to remove before
  push.  I don't understand why it says "Needs to be called before any
  changes are added with ReorderBufferQueueChange"; but if you edit that
  function and add an assert that the base snapshot is set, it crashes
  pretty quickly in the test_decoding tests.  (The supposedly bogus
  comment was there before your patch -- I'm not saying your comment
  addition is faulty.)

* I also noticed that we're doing subtxn cleanup one by one in both
  ReorderBufferAssignChild and ReorderBufferCommitChild, which means the
  top-level txn is sought in the hash table over and over, which seems a
  bit silly.  Not this patch's problem to fix ...

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index 1d601d8144..afcab930f7 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -50,7 +50,8 @@ regresscheck-install-force: | submake-regress 
submake-test_decoding temp-install
$(pg_regress_installcheck) \
$(REGRESSCHECKS)
 
-ISOLATIONCHECKS=mxact delayed_startup ondisk_startup concurrent_ddl_dml
+ISOLATIONCHECKS=mxact delayed_startup ondisk_startup concurrent_ddl_dml \
+   oldest_xmin snapshot_transfer
 
 isolationcheck: | submake-isolation submake-test_decoding temp-install
$(pg_isolation_regress_check) \
diff --git a/contrib/test_decoding/expected/oldest_xmin.out 
b/contrib/test_decoding/expected/oldest_xmin.out
new file mode 100644
index 00..d09342c4be
--- /dev/null
+++ b/contrib/test_decoding/expected/oldest_xmin.out
@@ -0,0 +1,27 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s0_begin s0_getxid s1_begin s1_insert s0_alter s0_commit 
s0_checkpoint s0_get_changes s1_commit s0_vacuum s0_get_changes
+step s0_begin: BEGIN;
+step s0_getxid: SELECT txid_current() IS NULL;
+?column?   
+
+f  
+step s1_begin: BEGIN;
+step s1_insert: INSERT INTO harvest VALUES ((1, 2, 3));
+step s0_alter: ALTER TYPE basket DROP ATTRIBUTE mangos;
+step s0_commit: COMMIT;
+step s0_checkpoint: CHECKPOINT;
+step s0_get_changes: SELECT data FROM 
pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', '0', 
'skip-empty-xacts', '1');
+data   
+
+step s1_commit: COMMIT;
+step s0_vacuum: VACUUM FULL;
+step s0_get_changes: SELECT data FROM 
pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', '0', 
'skip-empty-xacts', '1');
+data   
+
+BEGIN  
+table public.harvest: INSERT: fruits[basket]:'(1,2,3)'
+COMMIT 
+?column?   
+
+stop   
diff --git a/contrib/test_decoding/expected/snapshot_transfer.out 
b/contrib/test_decoding/expected/snapshot_transfer.out
new file mode 100644
index 00..87bed03f76
--- /dev/null
+++ b/contrib/test_decoding/expected/snapshot_transfer.out
@@ -0,0 +1,49 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s0_begin s0_begin_sub0 s0_log_assignment 
s0_sub_get_base_snap s1_produce_new_snap s0_insert s0_end_sub0 s0_commit 
s0_get_changes
+step s0_begin: BEGIN;
+step s0_begin_sub0: SAVEPOINT s0;
+step s0_log_assignment: SELECT txid_current() IS NULL;
+?column?   
+
+f  
+step s0_sub_get_base_snap: INSERT INTO dummy VALUES (0);
+step s1_produce_new_snap: ALTER TABLE harvest ADD COLUMN mangos int;
+step s0_insert: INSERT INTO harvest VALUES (1, 2, 3);
+step s0_end_sub0: RELEASE SAVEPOINT s0;
+step s0_commit: COMMIT;
+step s0_get_changes: SELECT data FROM 
pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'include-xids', '0', 
'skip-empty-xacts', '1');
+data   
+
+BEGIN  
+table public.dummy: INSERT: i[integer]:0
+table public.harvest: INSERT: apples[integer]:1 pears[integer]:2 
mangos[integer]:3
+COMMIT 
+?column?   
+
+stop   
+
+starting permutation: s0_begin s0_begin_sub0 s0_log_assignment s0_begin_sub1 

Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-06-21 Thread Arseny Sher
Hello,

Thank you for your time.

Alvaro Herrera  writes:

> On 2018-Jun-11, Antonin Houska wrote:
>
>>
>> One comment about the coding: since you introduce a new transaction list and
>> it's sorted by LSN, I think you should make the function AssertTXNLsnOrder()
>> more generic and use it to check the by_base_snapshot_lsn list too. For
>> example call it after the list insertion and deletion in
>> ReorderBufferPassBaseSnapshot().

Ok, probably this makes sense. Updated patch is attached.

>> Also I think it makes no harm if you split the diff into two, although both
>> fixes use the by_base_snapshot_lsn field.
>
> Please don't.

Yeah, I don't think we should bother with that.

--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

>From 61b4b0a89c95f8912729c54c013b64033f88fccf Mon Sep 17 00:00:00 2001
From: Arseny Sher 
Date: Thu, 21 Jun 2018 10:29:07 +0300
Subject: [PATCH] Fix slot's xmin advancement and subxact's lost snapshots in
 decoding

There are two in some way related bugs here. First, xmin of logical slots was
advanced too early. During xl_running_xacts processing, xmin of slot was set to
oldest running xid in the record, which is wrong: actually snapshots which will
be used for not-yet-replayed at this moment xacts might consider older xacts as
running too. The problem wasn't noticed earlier because DDL which allows to
delete tuple (set xmax) while some another not yet committed xact looks at it
is pretty rare, if not unique: e.g. all forms of ALTER TABLE which change schema
acquire ACCESS EXCLUSIVE lock conflicting with any inserts. Included test
case uses ALTER of a composite type, which doesn't have such interlocking.

To deal with this, we must be able to quickly retrieve oldest xmin
(oldest running xid among all assigned snapshots) from ReorderBuffer. Proposed
fix adds yet another list of ReorderBufferTXNs to it, where xacts are sorted by
LSN of the base snapshot, because
 * Plain usage of current ReorderBufferGetOldestTXN (to get base snapshot's
   xmin) is wrong, because order by LSN of the first record is *not* the same as
   order by LSN of record on which base snapshot was assigned: many record types
   don't need the snapshot, e.g. xl_xact_assignment. We could remember
   snapbuilder's xmin during processing of the first record, but that's too
   conservative and we would keep more tuples from vacuuming than needed.
 * On the other hand, we can't fully replace existing order by LSN of the first
   change with LSN of the base snapshot, because if we do that, WAL records
   which didn't forced receival of base snap might be recycled before we
   replay the xact, as ReorderBufferGetOldestTXN determines which LSN still must
   be kept.

Second issue concerns subxacts. Currently SnapBuildDistributeNewCatalogSnapshot
never spares a snapshot to known subxact. It means that if toplevel doesn't have
base snapshot (never did writes), there was an assignment record (subxact is
known) and subxact is writing, no new snapshots are being queued. Probably the
simplest fix would be to employ freshly baked by_base_snapshot_lsn list and just
distribute snapshot to all xacts with base snapshot whether they are subxacts or
not. However, in this case we would queue unneccessary snapshot to all already
known subxacts. To avoid this, in this patch base snapshot of known subxact
is immediately passed to the toplevel as soon as we learn the kinship / base
snap assigned -- we have to do that anyway in ReorderBufferCommitChild; new
snaps are distributed only to top-level xacts (or unknown subxacts) as before.

Also, minor memory leak is fixed: counter of toplevel's old base
snapshot was not decremented when snap has been passed from child.
---
 contrib/test_decoding/Makefile |   3 +-
 contrib/test_decoding/expected/oldest_xmin.out |  27 +++
 .../test_decoding/expected/subxact_snap_pass.out   |  49 +
 contrib/test_decoding/specs/oldest_xmin.spec   |  37 
 contrib/test_decoding/specs/subxact_snap_pass.spec |  42 +
 src/backend/replication/logical/reorderbuffer.c| 208 +++--
 src/backend/replication/logical/snapbuild.c|  15 +-
 src/include/replication/reorderbuffer.h|  15 +-
 8 files changed, 330 insertions(+), 66 deletions(-)
 create mode 100644 contrib/test_decoding/expected/oldest_xmin.out
 create mode 100644 contrib/test_decoding/expected/subxact_snap_pass.out
 create mode 100644 contrib/test_decoding/specs/oldest_xmin.spec
 create mode 100644 contrib/test_decoding/specs/subxact_snap_pass.spec

diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index 1d601d8144..05d7766cd5 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -50,7 +50,8 @@ regresscheck-install-force: | submake-regress submake-test_decoding temp-install
 	$(pg_regress_installcheck) \
 	$(REGRESS

Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-06-18 Thread Alvaro Herrera
On 2018-Jun-11, Antonin Houska wrote:

> Arseny Sher  wrote:
> > Please see detailed description of the issues, tests which reproduce
> > them and fixes in the attached patch.
> 
> I've played with your tests and gdb for a while, both w/o and with your
> patch. I think I can understand both problems. I could not invent simpler way
> to fix them.
> 
> One comment about the coding: since you introduce a new transaction list and
> it's sorted by LSN, I think you should make the function AssertTXNLsnOrder()
> more generic and use it to check the by_base_snapshot_lsn list too. For
> example call it after the list insertion and deletion in
> ReorderBufferPassBaseSnapshot().

I've been going over this patch also, and I've made a few minor edits of
the patch and the existing code as I come to understand what it's all
about.

> Also I think it makes no harm if you split the diff into two, although both
> fixes use the by_base_snapshot_lsn field.

Please don't.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-06-11 Thread Antonin Houska
Arseny Sher  wrote:
> Please see detailed description of the issues, tests which reproduce
> them and fixes in the attached patch.

I've played with your tests and gdb for a while, both w/o and with your
patch. I think I can understand both problems. I could not invent simpler way
to fix them.

One comment about the coding: since you introduce a new transaction list and
it's sorted by LSN, I think you should make the function AssertTXNLsnOrder()
more generic and use it to check the by_base_snapshot_lsn list too. For
example call it after the list insertion and deletion in
ReorderBufferPassBaseSnapshot().

Also I think it makes no harm if you split the diff into two, although both
fixes use the by_base_snapshot_lsn field.

-- 
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26, A-2700 Wiener Neustadt
Web: https://www.cybertec-postgresql.com



Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-05-25 Thread Arseny Sher

Michael Paquier  writes:

> but for now I would recommend to register this patch to the next
> commit fest under the category "Bug Fixes" so as it does not fall into
> the cracks: https://commitfest.postgresql.org/18/
>
> I have added an entry to the open items in the section for older bugs:
> https://wiki.postgresql.org/wiki/PostgreSQL_11_Open_Items#Older_Bugs
> However this list tends to be... Er... Ignored.
>

Thank you, Michael. I have created the commitfest entry.

--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-04-16 Thread Michael Paquier
On Sun, Apr 08, 2018 at 08:46:04AM +0300, Arseny Sher wrote:
> I've discovered a couple of bugs in logical decoding code, both leading
> to incorrect decoding results in somewhat rare cases. First, xmin of
> slots is advanced too early. This affects the results only when
> interlocking allows to perform DDL concurrently with looking at the
> schema. In fact, I was not aware about such DDL until at
> https://www.postgresql.org/message-id/flat/87tvu0p0jm.fsf%40ars-thinkpad#87tvu0p0jm.fsf@ars-thinkpad
> I raised this question and Andres pointed out ALTER of composite
> types. Probably there are others, I am not sure; it would be interesting
> to know them.
> 
> Another problem is that new snapshots are never queued to known
> subxacts. It means decoding results can be wrong if toplevel doesn't
> write anything while subxact does.

Most people are recovering from the last commit fest, where many patches
have been discussed and dealt with.  I have not looked in details at
your patch so I cannot say is legit or not, but for now I would
recommend to register this patch to the next commit fest under the
category "Bug Fixes" so as it does not fall into the cracks:
https://commitfest.postgresql.org/18/

I have added an entry to the open items in the section for older bugs:
https://wiki.postgresql.org/wiki/PostgreSQL_11_Open_Items#Older_Bugs
However this list tends to be... Er... Ignored.

> Please see detailed description of the issues, tests which reproduce
> them and fixes in the attached patch.

Those are always good things to have in a patch.
--
Michael


signature.asc
Description: PGP signature


Re: Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-04-16 Thread Arseny Sher
(delicate ping)



Fix slot's xmin advancement and subxact's lost snapshots in decoding.

2018-04-07 Thread Arseny Sher
Hello,

I've discovered a couple of bugs in logical decoding code, both leading
to incorrect decoding results in somewhat rare cases. First, xmin of
slots is advanced too early. This affects the results only when
interlocking allows to perform DDL concurrently with looking at the
schema. In fact, I was not aware about such DDL until at
https://www.postgresql.org/message-id/flat/87tvu0p0jm.fsf%40ars-thinkpad#87tvu0p0jm.fsf@ars-thinkpad
I raised this question and Andres pointed out ALTER of composite
types. Probably there are others, I am not sure; it would be interesting
to know them.

Another problem is that new snapshots are never queued to known
subxacts. It means decoding results can be wrong if toplevel doesn't
write anything while subxact does.

Please see detailed description of the issues, tests which reproduce
them and fixes in the attached patch.

--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

>From 0d03ef172a29ce64a6c5c26f484f0f3186895125 Mon Sep 17 00:00:00 2001
From: Arseny Sher <sher-...@yandex.ru>
Date: Sat, 7 Apr 2018 08:22:30 +0300
Subject: [PATCH] Fix slot's xmin advancement and subxact's lost snapshots in
 decoding.

There are two in some way related bugs here. First, xmin of logical slots was
advanced too early. During xl_running_xacts processing, xmin of slot was set to
oldest running xid in the record, which is wrong: actually snapshots which will
be used for not-yet-replayed at this moment xacts might consider older xacts as
running too. The problem wasn't noticed earlier because DDL which allows to
delete tuple (set xmax) while some another not yet committed xact looks at it
is pretty rare, if not unique: e.g. all forms of ALTER TABLE which change schema
acquire ACCESS EXCLUSIVE lock conflicting with any inserts. Included test
case uses ALTER of a composite type, which doesn't have such interlocking.

To deal with this, we must be able to quickly retrieve oldest xmin
(oldest running xid among all assigned snapshots) from ReorderBuffer. Proposed
fix adds yet another list of ReorderBufferTXNs to it, where xacts are sorted by
LSN of the base snapshot, because
 * Plain usage of current ReorderBufferGetOldestTXN (to get base snapshot's
   xmin) is wrong, because order by LSN of the first record is *not* the same as
   order by LSN of record on which base snapshot was assigned: many record types
   don't need the snapshot, e.g. xl_xact_assignment. We could remember
   snapbuilder's xmin during processing of the first record, but that's too
   conservative and we would keep more tuples from vacuuming than needed.
 * On the other hand, we can't fully replace existing order by LSN of the first
   change with LSN of the base snapshot, because if we do that, WAL records
   which didn't forced receival of base snap might be recycled before we
   replay the xact, as ReorderBufferGetOldestTXN determines which LSN still must
   be kept.

Second issue concerns subxacts. Currently SnapBuildDistributeNewCatalogSnapshot
never spares a snapshot to known subxact. It means that if toplevel doesn't have
base snapshot (never did writes), there was an assignment record (subxact is
known) and subxact is writing, no new snapshots are being queued. Probably the
simplest fix would be to employ freshly baked by_base_snapshot_lsn list and just
distribute snapshot to all xacts with base snapshot whether they are subxacts or
not. However, in this case we would queue unneccessary snapshot to all already
known subxacts. To avoid this, in this patch base snapshot of known subxact
is immediately passed to the toplevel as soon as we learn the kinship / base
snap assigned -- we have to do that anyway in ReorderBufferCommitChild; new
snaps are distributed only to top-level xacts (or unknown subxacts) as before.

Also, minor memory leak is fixed: counter of toplevel's old base
snapshot was not decremented when snap has been passed from child.
---
 contrib/test_decoding/Makefile |   3 +-
 contrib/test_decoding/expected/oldest_xmin.out |  27 +++
 .../test_decoding/expected/subxact_snap_pass.out   |  49 ++
 contrib/test_decoding/specs/oldest_xmin.spec   |  37 +
 contrib/test_decoding/specs/subxact_snap_pass.spec |  42 +
 src/backend/replication/logical/reorderbuffer.c| 183 ++---
 src/backend/replication/logical/snapbuild.c|  15 +-
 src/include/replication/reorderbuffer.h|  15 +-
 8 files changed, 304 insertions(+), 67 deletions(-)
 create mode 100644 contrib/test_decoding/expected/oldest_xmin.out
 create mode 100644 contrib/test_decoding/expected/subxact_snap_pass.out
 create mode 100644 contrib/test_decoding/specs/oldest_xmin.spec
 create mode 100644 contrib/test_decoding/specs/subxact_snap_pass.spec

diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index 6c18189d9d..c696288d67 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -5