Re: Improve comment on cid mapping (was Re: Adding CommandID to heap xlog records)

2023-06-26 Thread Andres Freund
Hi,

On 2023-06-26 09:57:56 +0300, Heikki Linnakangas wrote:
> diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
> index 0786bb0ab7..e403feeccd 100644
> --- a/src/backend/replication/logical/snapbuild.c
> +++ b/src/backend/replication/logical/snapbuild.c
> @@ -41,10 +41,15 @@
>   * transactions we need Snapshots that see intermediate versions of the
>   * catalog in a transaction. During normal operation this is achieved by using
>   * CommandIds/cmin/cmax. The problem with that however is that for space
> - * efficiency reasons only one value of that is stored
> - * (cf. combocid.c). Since combo CIDs are only available in memory we log
> - * additional information which allows us to get the original (cmin, cmax)
> - * pair during visibility checks. Check the reorderbuffer.c's comment above
> + * efficiency reasons, the cmin and cmax are not included in WAL records. We
> + * cannot read the cmin/cmax from the tuple itself, either, because it is
> + * reset on crash recovery. Even if we could, we could not decode combocids
> + * which are only tracked in the original backend's memory. To work around
> + * that, heapam writes an extra WAL record (XLOG_HEAP2_NEW_CID) every time a
> + * catalog row is modified, which includes the cmin and cmax of the
> + * tuple. During decoding, we insert the ctid->(cmin,cmax) mappings into the
> + * reorder buffer, and use them at visibility checks instead of the cmin/cmax
> + * on the tuple itself. Check the reorderbuffer.c's comment above
>   * ResolveCminCmaxDuringDecoding() for details.
>   *
>   * To facilitate all this we need our own visibility routine, as the normal
> -- 
> 2.30.2

LGTM


> From 9140a0d98fd21b595eac6d75521a6b1a9f1b Mon Sep 17 00:00:00 2001
> From: Heikki Linnakangas 
> Date: Mon, 26 Jun 2023 09:56:02 +0300
> Subject: [PATCH v2 2/2] Remove redundant check for fast_forward.
> 
> We already checked for it earlier in the function.
> 
> Discussion: https://www.postgresql.org/message-id/1ba2899e-77f8-7866-79e5-f3b7d1251...@iki.fi
> ---
>  src/backend/replication/logical/decode.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
> index d91055a440..7039d425e2 100644
> --- a/src/backend/replication/logical/decode.c
> +++ b/src/backend/replication/logical/decode.c
> @@ -422,8 +422,7 @@ heap2_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
>  	switch (info)
>  	{
>  		case XLOG_HEAP2_MULTI_INSERT:
> -			if (!ctx->fast_forward &&
> -				SnapBuildProcessChange(builder, xid, buf->origptr))
> +			if (SnapBuildProcessChange(builder, xid, buf->origptr))
>  				DecodeMultiInsert(ctx, buf);
>  			break;
>  		case XLOG_HEAP2_NEW_CID:
> -- 
> 2.30.2

LGTM^2

Greetings,

Andres Freund




Improve comment on cid mapping (was Re: Adding CommandID to heap xlog records)

2023-06-26 Thread Heikki Linnakangas

On 28/02/2023 15:52, Heikki Linnakangas wrote:

> So unfortunately I don't see much opportunity to simplify logical
> decoding with this. However, please take a look at the first two patches
> attached. They're tiny cleanups that make sense on their own.


Rebased these small patches. I'll add this to the commitfest.

--
Heikki Linnakangas
Neon (https://neon.tech)
From 66289440ac65ea386cd138aeeed27b0032c2bb80 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Mon, 26 Jun 2023 09:48:26 +0300
Subject: [PATCH v2 1/2] Improve comment on why we need ctid->(cmin,cmax)
 mapping.

Combocids are only part of the problem. Explain the problem in more detail.

Discussion: https://www.postgresql.org/message-id/1ba2899e-77f8-7866-79e5-f3b7d1251...@iki.fi
---
 src/backend/replication/logical/snapbuild.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 0786bb0ab7..e403feeccd 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -41,10 +41,15 @@
  * transactions we need Snapshots that see intermediate versions of the
  * catalog in a transaction. During normal operation this is achieved by using
  * CommandIds/cmin/cmax. The problem with that however is that for space
- * efficiency reasons only one value of that is stored
- * (cf. combocid.c). Since combo CIDs are only available in memory we log
- * additional information which allows us to get the original (cmin, cmax)
- * pair during visibility checks. Check the reorderbuffer.c's comment above
+ * efficiency reasons, the cmin and cmax are not included in WAL records. We
+ * cannot read the cmin/cmax from the tuple itself, either, because it is
+ * reset on crash recovery. Even if we could, we could not decode combocids
+ * which are only tracked in the original backend's memory. To work around
+ * that, heapam writes an extra WAL record (XLOG_HEAP2_NEW_CID) every time a
+ * catalog row is modified, which includes the cmin and cmax of the
+ * tuple. During decoding, we insert the ctid->(cmin,cmax) mappings into the
+ * reorder buffer, and use them at visibility checks instead of the cmin/cmax
+ * on the tuple itself. Check the reorderbuffer.c's comment above
  * ResolveCminCmaxDuringDecoding() for details.
  *
  * To facilitate all this we need our own visibility routine, as the normal
-- 
2.30.2
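
For readers unfamiliar with the mechanism the revised comment describes, here is a minimal sketch of a ctid->(cmin,cmax) lookup table. It is not the reorderbuffer.c implementation: the `TupleId`, `CidMapEntry` and `CidMap` types, the fixed capacity, and the `cid_map_put()`/`cid_map_get()` helpers are all hypothetical names invented for illustration, and letting a later record for the same tuple overwrite the earlier one is a simplification of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; NOT the PostgreSQL definitions. */
typedef uint32_t CommandId;

typedef struct TupleId			/* plays the role of a ctid */
{
	uint32_t	block;
	uint16_t	offset;
} TupleId;

typedef struct CidMapEntry
{
	TupleId		tid;
	CommandId	cmin;
	CommandId	cmax;
} CidMapEntry;

#define CID_MAP_CAPACITY 64		/* arbitrary limit for the sketch */

typedef struct CidMap
{
	size_t		nentries;
	CidMapEntry entries[CID_MAP_CAPACITY];
} CidMap;

static bool
tid_equal(TupleId a, TupleId b)
{
	return a.block == b.block && a.offset == b.offset;
}

/*
 * Remember the (cmin, cmax) pair carried by a logged record for one tuple.
 * Assumption of this sketch: a later record for the same tuple simply
 * overwrites the earlier pair.  Returns false if the table is full.
 */
static bool
cid_map_put(CidMap *map, TupleId tid, CommandId cmin, CommandId cmax)
{
	assert(map != NULL);

	for (size_t i = 0; i < map->nentries; i++)
	{
		if (tid_equal(map->entries[i].tid, tid))
		{
			map->entries[i].cmin = cmin;
			map->entries[i].cmax = cmax;
			return true;
		}
	}

	if (map->nentries == CID_MAP_CAPACITY)
		return false;

	map->entries[map->nentries++] = (CidMapEntry) {tid, cmin, cmax};
	return true;
}

/*
 * Visibility-check side: fetch the pair from the map rather than from the
 * tuple header.  Returns false if no mapping was recorded for the tuple.
 */
static bool
cid_map_get(const CidMap *map, TupleId tid, CommandId *cmin, CommandId *cmax)
{
	assert(map != NULL && cmin != NULL && cmax != NULL);

	for (size_t i = 0; i < map->nentries; i++)
	{
		if (tid_equal(map->entries[i].tid, tid))
		{
			*cmin = map->entries[i].cmin;
			*cmax = map->entries[i].cmax;
			return true;
		}
	}
	return false;
}
```

A linear scan keeps the sketch short; it only shows the shape of the lookup, with the values coming from logged records instead of the tuple itself.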

From 9140a0d98fd21b595eac6d75521a6b1a9f1b Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Mon, 26 Jun 2023 09:56:02 +0300
Subject: [PATCH v2 2/2] Remove redundant check for fast_forward.

We already checked for it earlier in the function.

Discussion: https://www.postgresql.org/message-id/1ba2899e-77f8-7866-79e5-f3b7d1251...@iki.fi
---
 src/backend/replication/logical/decode.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/backend/replication/logical/decode.c b/src/backend/replication/logical/decode.c
index d91055a440..7039d425e2 100644
--- a/src/backend/replication/logical/decode.c
+++ b/src/backend/replication/logical/decode.c
@@ -422,8 +422,7 @@ heap2_decode(LogicalDecodingContext *ctx, XLogRecordBuffer *buf)
 	switch (info)
 	{
 		case XLOG_HEAP2_MULTI_INSERT:
-			if (!ctx->fast_forward &&
-				SnapBuildProcessChange(builder, xid, buf->origptr))
+			if (SnapBuildProcessChange(builder, xid, buf->origptr))
 				DecodeMultiInsert(ctx, buf);
 			break;
 		case XLOG_HEAP2_NEW_CID:
-- 
2.30.2
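
To illustrate why the second patch is safe, here is a small self-contained sketch of the control-flow pattern its commit message relies on: once a function has returned early when a flag is set, testing that flag again further down is dead code. The `SketchContext` and `SketchRecord` types and the `sketch_decode()` function are hypothetical stand-ins, not the decode.c code, and the `change_is_interesting` parameter merely stands in for the result of a call such as SnapBuildProcessChange().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; NOT the decode.c definitions. */
typedef struct SketchContext
{
	bool		fast_forward;
	int			ndecoded;		/* counts records handed to the decoder */
} SketchContext;

typedef enum SketchRecord
{
	SKETCH_MULTI_INSERT,
	SKETCH_OTHER
} SketchRecord;

static void
sketch_decode(SketchContext *ctx, SketchRecord rec, bool change_is_interesting)
{
	assert(ctx != NULL);

	/* Early exit: nothing below runs while fast-forwarding. */
	if (ctx->fast_forward)
		return;

	switch (rec)
	{
		case SKETCH_MULTI_INSERT:
			/*
			 * ctx->fast_forward is known to be false here, so re-testing
			 * it would never change the outcome.
			 */
			if (change_is_interesting)
				ctx->ndecoded++;
			break;
		case SKETCH_OTHER:
			break;
	}
}
```

With the early return in place, the per-case condition reduces to the single remaining test, which is the shape the patch leaves behind.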