Thanks Tomas for the updated patches.

Here are my comments on 0006 patch as well as 0002 patch.

On Wed, Jul 19, 2023 at 4:23 PM Tomas Vondra
<tomas.von...@enterprisedb.com> wrote:
>
> I think this is an accurate description of what the current patch does.
> And I think it's a reasonable behavior.
>
> My point is that if this gets released in PG17, it'll be difficult to
> change, even if it does not change on-disk format.
>

Yes. I agree. And I don't see any problem even if we are not able to change it.

>
> I need to think behavior about this a bit more, and maybe check how
> difficult would be implementing it.

Ok.

In most of the comments and in documentation, there are some phrases
which do not look accurate.

Change to a sequence is being refered to as "sequence increment". While
ascending sequences are common, PostgreSQL supports descending sequences as
well. The changes there will be decrements. But that's not the only case. A
sequence may be restarted with an older value, in which case the change could
increment or a decrement. I think correct usage is 'changes to sequence' or
'sequence changes'.

Sequence being assigned a new relfilenode is referred to as sequence
being created. This is confusing. When an existing sequence is ALTERed, we
will not "create" a new sequence but we will "create" a new relfilenode and
"assign" it to that sequence.

PFA such edits in 0002 and 0006 patches. Let me know if those look
correct. I think we
need similar changes to the documentation and comments in other places.

>
> I did however look at the proposed alternative to the "created" flag.
> The attached 0006 part ditches the flag with XLOG_SMGR_CREATE decoding.
> The smgr_decode code needs a review (I'm not sure the
> skipping/fast-forwarding part is correct), but it seems to be working
> fine overall, although we need to ensure the WAL record has the correct XID.
>

Briefly describing the patch. When decoding a XLOG_SMGR_CREATE WAL
record, it adds the relfilenode mentioned in it to the sequences hash.
When decoding a sequence change record, it checks whether the
relfilenode in the WAL record is in hash table. If it is the sequence
changes is deemed transactional otherwise non-transactional. The
change looks good to me. It simplifies the logic to decide whether a
sequence change is transactional or not.

In sequence_decode() we skip sequence changes when fast forwarding.
Given that smgr_decode() is only to supplement sequence_decode(), I
think it's correct to do the same in smgr_decode() as well. Simillarly
skipping when we don't have full snapshot.

Some minor comments on 0006 patch

+    /* make sure the relfilenode creation is associated with the XID */
+    if (XLogLogicalInfoActive())
+        GetCurrentTransactionId();

I think this change is correct and is inline with similar changes in 0002. But
I looked at other places from where DefineRelation() is called. For regular
tables it is called from ProcessUtilitySlow() which in turn does not call
GetCurrentTransactionId(). I am wondering whether we are just discovering a
class of bugs caused by not associating an xid with a newly created
relfilenode.

+    /*
+     * If we don't have snapshot or we are just fast-forwarding, there is no
+     * point in decoding changes.
+     */
+    if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT ||
+        ctx->fast_forward)
+        return;

This code block is repeated.

+void
+ReorderBufferAddRelFileLocator(ReorderBuffer *rb, TransactionId xid,
+                               RelFileLocator rlocator)
+{
    ... snip ...
+
+    /* sequence changes require a transaction */
+    if (xid == InvalidTransactionId)
+        return;

IIUC, with your changes in DefineSequence() in this patch, this should not
happen. So this condition will never be true. But in case it happens, this code
will not add the relfilelocation to the hash table and we will deem the
sequence change as non-transactional. Isn't it better to just throw an error
and stop replication if that (ever) happens?

Also some comments on 0002 patch

@@ -405,8 +405,19 @@ fill_seq_fork_with_data(Relation rel, HeapTuple
tuple, ForkNumber forkNum)

     /* check the comment above nextval_internal()'s equivalent call. */
     if (RelationNeedsWAL(rel))
+    {
         GetTopTransactionId();

+        /*
+         * Make sure the subtransaction has a XID assigned, so that
the sequence
+         * increment WAL record is properly associated with it. This
matters for
+         * increments of sequences created/altered in the
transaction, which are
+         * handled as transactional.
+         */
+        if (XLogLogicalInfoActive())
+            GetCurrentTransactionId();
+    }
+

I think we should separately commit the changes which add a call to
GetCurrentTransactionId(). That looks like an existing bug/anomaly
which can stay irrespective of this patch.

+    /*
+     * To support logical decoding of sequences, we require the sequence
+     * callback. We decide it here, but only check it later in the wrappers.
+     *
+     * XXX Isn't it wrong to define only one of those callbacks? Say we
+     * only define the stream_sequence_cb() - that may get strange results
+     * depending on what gets streamed. Either none or both?
+     *
+     * XXX Shouldn't sequence be defined at slot creation time, similar
+     * to two_phase? Probably not.
+     */

Do you intend to keep these XXX's as is? My previous comments on this comment
block are in [1].

In fact, given that whether or not sequences are replicated is decided by the
protocol version, do we really need LogicalDecodingContext::sequences? Drawing
parallel with WAL messages, I don't think it's needed.

[1] 
https://www.postgresql.org/message-id/CAExHW5vScYKKb0RZoiNEPfbaQ60hihfuWeLuZF4JKrwPJXPcUw%40mail.gmail.com

-- 
Best Wishes,
Ashutosh Bapat
From 9be6a6be004f71c80b2f88fd459207866ade061e Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <ashutosh.ba...@enterprisedb.com>
Date: Tue, 18 Jul 2023 19:24:17 +0530
Subject: [PATCH 7/7] Edits on 0002 patch

---
 src/backend/replication/logical/decode.c      |  2 +-
 src/backend/replication/logical/logical.c     |  8 +--
 .../replication/logical/reorderbuffer.c       | 56 +++++++++----------
 3 files changed, 32 insertions(+), 34 deletions(-)

diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index e08052617d..e24ccd9af6 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -1412,7 +1412,7 @@ sequence_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 	/*
 	 * Should we handle the sequence increment as transactional or not?
 	 *
-	 * If the sequence was created in a still-running transaction, treat
+	 * If the sequence was assigned a new relfilenode in the transaction being decoded, treat
 	 * it as transactional and queue the increments. Otherwise it needs
 	 * to be treated as non-transactional, in which case we send it to
 	 * the plugin right away.
diff --git a/src/backend/replication/logical/logical.c b/src/backend/replication/logical/logical.c
index d2ff674e32..74ba6fd861 100644
--- a/src/backend/replication/logical/logical.c
+++ b/src/backend/replication/logical/logical.c
@@ -231,10 +231,10 @@ StartupDecodingContext(List *output_plugin_options,
 
 	/*
 	 * To support streaming, we require start/stop/abort/commit/change
-	 * callbacks. The message and truncate callbacks are optional, similar to
-	 * regular output plugins. We however enable streaming when at least one
-	 * of the methods is enabled so that we can easily identify missing
-	 * methods.
+	 * callbacks. The message, sequence and truncate callbacks are optional,
+	 * similar to regular output plugins. We however enable streaming when at
+	 * least one of the methods is enabled so that we can easily identify
+	 * missing methods.
 	 *
 	 * We decide it here, but only check it later in the wrappers.
 	 */
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index d0bcc5a2f8..6f5678fd2f 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -77,36 +77,33 @@
  *	  a bit more memory to the oldest subtransactions, because it's likely
  *	  they are the source for the next sequence of changes.
  *
- *	  When decoding sequences, we differentiate between a sequences created
- *	  in a (running) transaction, and sequences created in other (already
- *	  committed) transactions. Changes for sequences created in the same
- *	  top-level transaction are treated as "transactional" i.e. just like
- *	  any other change from that transaction (and discarded in case of a
- *	  rollback). Changes for sequences created earlier are treated as not
- *	  transactional - are processed immediately, as if performed outside
- *	  any transaction (and thus not rolled back).
- *
+ *    When decoding sequences, we differentiate between changes to the
+ *    sequences assigned a new relfilenode in in a transaction being decoded,
+ *    and the changes to the sequences created in other (already committed)
+ *    transactions. First type of changes are treated as "transactional" i.e.
+ *    just like any other change from that transaction (and discarded in case
+ *    of a rollback). Second type of changes are treated as non-transactional.
+ *    They are processed immediately, as if performed outside any transaction
+ *    (and thus not rolled back).
+ *   
  *	  This mixed behavior is necessary - sequences are non-transactional
  *	  (e.g. ROLLBACK does not undo the sequence increments). But for new
  *	  sequences, we need to handle them in a transactional way, because if
  *	  we ever get some DDL support, the sequence won't exist until the
- *	  transaction gets applied. So we need to ensure the increments don't
- *	  happen until the sequence gets created.
- *
- *	  To differentiate which sequences are "old" and which were created
- *	  in a still-running transaction, we track sequences created in running
- *	  transactions in a hash table. Sequences are identified by relfilelocator,
- *	  and we track XID of the (sub)transaction that created it. This means
- *	  that if a transaction does something that changes the relfilelocator
- *	  (like an alter / reset of a sequence), the new relfilelocator will be
- *	  treated as if created in the transaction. The list of sequences gets
- *	  discarded when the transaction completes (commit/rollback).
- *
- *	  We don't use the XID to check if it's the same top-level transaction.
- *	  It's enough to know it was created in an in-progress transaction,
- *	  and we know it must be the current one because otherwise it wouldn't
- *	  see the sequence object.
+ *	  transaction gets applied. So we need to ensure the changes don't
+ *	  happen until the sequence gets created. Similarly when a new relfilenode is assigned to a sequence (because of a DDL) the change does not become visible till the transaction gets committed. It the transaction gets aborted it will never be visible.
  *
+ *    To differentiate between the changes to the sequences, we track sequences
+ *    which are assigned a new relfilenode in transactions being decoded in a
+ *    hash table.  We also track XID of the (sub)transaction that assigned the
+ *    new relfilenode. The list of sequence relfilenodes gets discarded when
+ *    the transaction completes (commit/rollback).
+ *   
+ *    We don't use the XID to check if it's the same top-level transaction.
+ *    It's enough to know it was created in the transaction being decoded, and
+ *    we know it must be the current one because otherwise it wouldn't see the
+ *    sequence object.
+ *   
  *	  The XID may be valid even for non-transactional sequences - we simply
  *	  keep the XID logged to WAL, it's up to the reorderbuffer to decide if
  *	  the increment is transactional.
@@ -953,11 +950,12 @@ ReorderBufferQueueMessage(ReorderBuffer *rb, TransactionId xid,
 }
 
 /*
- * Treat the sequence increment as transactional?
+ * Treat the sequence change as transactional?
  *
- * The hash table tracks all sequences created in in-progress transactions,
- * so we simply do a lookup (the sequence is identified by relfilende). If
- * we find a match, the increment should be handled as transactional.
+ * The hash table tracks all sequences which are assigned a new relfilenode in
+ * the transaction being decoded, so we simply do a lookup (the sequence is
+ * identified by relfilende). If we find a match, the change should be handled
+ * as transactional.
  */
 bool
 ReorderBufferSequenceIsTransactional(ReorderBuffer *rb,
-- 
2.25.1

From 3da56bd0dd80a27e19b67d18648a4cd958883f65 Mon Sep 17 00:00:00 2001
From: Ashutosh Bapat <ashutosh.ba...@enterprisedb.com>
Date: Thu, 20 Jul 2023 12:46:50 +0530
Subject: [PATCH 8/8] Edit on patch 0006

---
 src/backend/replication/logical/decode.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index e24ccd9af6..e219a29469 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -1441,19 +1441,20 @@ sequence_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 /*
  * Handle sequence decode
  *
- * Decoding sequences is a bit tricky, because while most sequence actions
- * are non-transactional (not subject to rollback), some need to be handled
- * as transactional.
+ * Decoding changes to sequences is a bit tricky, because while most sequence
+ * actions are non-transactional (not subject to rollback), some need to be
+ * handled as transactional.
  *
- * By default, a sequence increment is non-transactional - we must not queue
- * it in a transaction as other changes, because the transaction might get
- * rolled back and we'd discard the increment. The downstream would not be
- * notified about the increment, which is wrong.
+ * By default, a sequence change is non-transactional - we must not queue it in
+ * a transaction as other changes, because the transaction might get rolled back
+ * and we'd discard the increment. The downstream would not be notified about
+ * the increment, which is wrong.
  *
- * On the other hand, the sequence may be created in a transaction. In this
- * case we *should* queue the change as other changes in the transaction,
- * because we don't want to send the increments for unknown sequence to the
- * plugin - it might get confused about which sequence it's related to etc.
+ * On the other hand, the relfilenode associated with the sequence may be
+ * changed in a transaction. In this case we *should* queue the change as other
+ * changes in the transaction, because we don't want to send the sequence
+ * changes, which may be rolled back, to the plugin before the transaction is
+ * committed. 
  */
 void
 smgr_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
-- 
2.25.1

Reply via email to