On 2019-Jul-26, Andres Freund wrote:
> 2) We could simply assign the subtransaction to the parent using
> ReorderBufferAssignChild() in SnapBuildProcessNewCid() or it's
> caller. That ought to also fix the bug
>
> I also has the advantage that we can save some memory in transactions
> that have some, but fewer than the ASSIGNMENT limit subtransactions,
> because it allows us to avoid having a separate base snapshot for
> them (c.f. ReorderBufferTransferSnapToParent()).
I'm not sure I understood this suggestion correctly. I first tried with
this, which seems the simplest rendition:
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -772,6 +772,12 @@ SnapBuildProcessNewCid(SnapBuild *builder, TransactionId
xid,
{
CommandId cid;
+ if ((SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT) &&
+ (xlrec->top_xid != xid))
+ {
+ ReorderBufferAssignChild(builder->reorder, xlrec->top_xid, xid,
lsn);
+ }
+
/*
* we only log new_cid's if a catalog tuple was modified, so mark the
* transaction as containing catalog modifications
test_decoding's tests pass with that, but if I try the example script
provided by Ildar, all pgbench clients die with this:
client 19 script 1 aborted in command 1 query 0: ERROR: subtransaction logged
without previous top-level txn record
I thought I would create the main txn before calling AssignChild in
snapbuild; however, ReorderBufferTXNByXid is static in reorderbuffer.c.
So that seems out. My next try was to remove the elog() that was
causing the failure ... but that leads pretty quickly to a crash with
this backtrace:
#2 0x00005653241fb823 in ExceptionalCondition
(conditionName=conditionName@entry=0x5653243c1960 "!(prev_first_lsn <
cur_txn->first_lsn)",
errorType=errorType@entry=0x565324250596 "FailedAssertion",
fileName=fileName@entry=0x5653243c18e8
"/pgsql/source/master/src/backend/replication/logical/reorderbuffer.c",
lineNumber=lineNumber@entry=680) at
/pgsql/source/master/src/backend/utils/error/assert.c:54
#3 0x0000565324062a84 in AssertTXNLsnOrder (rb=rb@entry=0x565326304fa8)
at /pgsql/source/master/src/backend/replication/logical/reorderbuffer.c:680
#4 0x0000565324062e39 in ReorderBufferTXNByXid (rb=rb@entry=0x565326304fa8,
xid=<optimized out>, xid@entry=185613, create=create@entry=true,
is_new=is_new@entry=0x0, lsn=lsn@entry=2645271944,
create_as_top=create_as_top@entry=true)
at /pgsql/source/master/src/backend/replication/logical/reorderbuffer.c:559
#5 0x0000565324067365 in ReorderBufferAddNewTupleCids (rb=0x565326304fa8,
xid=185613, lsn=lsn@entry=2645271944, node=..., tid=..., cmin=0,
cmax=4294967295, combocid=4294967295) at
/pgsql/source/master/src/backend/replication/logical/reorderbuffer.c:2100
#6 0x0000565324069451 in SnapBuildProcessNewCid (builder=0x56532630afd8,
xid=185614, lsn=2645271944, xlrec=0x5653262efc78)
at /pgsql/source/master/src/backend/replication/logical/snapbuild.c:787
Now this failure goes away if I relax the < to <= in the
complained-about line ... but at this point it's two sanity checks that
I've lobotomized in order to get this to run at all. Not really
comfortable with that.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 5fa3d7323e..55538fa44c 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -677,7 +677,7 @@ AssertTXNLsnOrder(ReorderBuffer *rb)
/* Current initial LSN must be strictly higher than previous */
if (prev_first_lsn != InvalidXLogRecPtr)
- Assert(prev_first_lsn < cur_txn->first_lsn);
+ Assert(prev_first_lsn <= cur_txn->first_lsn);
/* known-as-subtxn txns must not be listed */
Assert(!cur_txn->is_known_as_subxact);
@@ -778,9 +778,6 @@ ReorderBufferAssignChild(ReorderBuffer *rb, TransactionId xid,
txn = ReorderBufferTXNByXid(rb, xid, true, &new_top, lsn, true);
subtxn = ReorderBufferTXNByXid(rb, subxid, true, &new_sub, lsn, false);
- if (new_top && !new_sub)
- elog(ERROR, "subtransaction logged without previous top-level txn record");
-
if (!new_sub)
{
if (subtxn->is_known_as_subxact)
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index dc64b1e0c2..c8b1fcd96c 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -772,6 +772,10 @@ SnapBuildProcessNewCid(SnapBuild *builder, TransactionId xid,
{
CommandId cid;
+ if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT &&
+ xlrec->top_xid != xid)
+ ReorderBufferAssignChild(builder->reorder, xlrec->top_xid, xid, lsn);
+
/*
* we only log new_cid's if a catalog tuple was modified, so mark the
* transaction as containing catalog modifications