On Thu, Dec 14, 2023 at 12:37 PM Dilip Kumar <[email protected]> wrote:
>
> I think you forgot to attach the patch.
Sorry. Here it is.
On Thu, Dec 14, 2023 at 2:36 PM Amit Kapila <[email protected]> wrote:
>
> >
> It looks like the solution works. But this is the only place where we
> process a change before SNAPSHOT reaches FULL. But this is also the
> only record which affects a decision to queue/not a following change.
> So it should be ok. The sequence_hash'es as separate for each
> transaction and they are cleaned when processing COMMIT record.
> >
>
> But it is possible that even commit or abort also happens before the
> snapshot reaches full state in which case the hash table will have
> stale or invalid (for aborts) entries. That will probably be cleaned
> at a later point by running_xact records.
Why would cleaning wait till running_xact records? Won't txn entry
itself be removed when processing commit/abort record? At the same the
sequence hash will be cleaned as well.
> Now, I think in theory, it
> is possible that the same RelFileLocator can again be allocated before
> we clean up the existing entry which can probably confuse the system.
How? The transaction allocating the first time would be cleaned before
it happens the second time. So shouldn't matter.
--
Best Wishes,
Ashutosh Bapat
diff --git a/src/backend/replication/logical/decode.c
b/src/backend/replication/logical/decode.c
index 73e38cafd8..8e2ebbd8e0 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -1543,10 +1543,15 @@ smgr_decode(LogicalDecodingContext *ctx,
XLogRecordBuffer *buf)
elog(LOG, "XXX: smgr_decode. snapshot is %s",
SnapBuildIdentify(builder));
/*
- * If we don't have snapshot or we are just fast-forwarding, there is no
- * point in decoding relfilenode information.
+ * If we are not building snapshot yet or we are just fast-forwarding,
there
+ * is no point in decoding relfilenode information. If the sequence
+ * associated with relfilenode here is changed in the same transaction
but
+ * after snapshot was built, the relfilenode needs to be present in the
+ * sequence hash table so that the change can be deemed as
transactional.
+ * Otherwise it will not be queued. Hence we process this change even
before
+ * we have built snapshot.
*/
- if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT ||
+ if (SnapBuildCurrentState(builder) < SNAPBUILD_BUILDING_SNAPSHOT ||
ctx->fast_forward)
{
elog(LOG, "XXX: skipped");