Hi,
On Tue, Oct 08, 2024 at 04:25:29PM +1100, Peter Smith wrote:
> Hi, here are some review comments for patch v11.
Thanks for looking at it!
> ======
> contrib/pg_logicalinspect/specs/logical_inspect.spec
>
> 1.
> nit - Add some missing spaces after commas (,) in the SQL.
Fine by me, done in v12 attached.
> ======
> doc/src/sgml/pglogicalinspect.sgml
>
> 2.
> + <note>
> + <para>
> + The <filename>pg_logicalinspect</filename> functions are called
> + using a text argument that can be extracted from the output name of the
> + <function>pg_ls_logicalsnapdir</function>() function.
> + </para>
> + </note>
>
> 2a. wording
>
> The wording "using a text argument that can be extracted" seems like a
> hangover from the previous implementation; it does not even say what
> that "text argument" means.
That's right (though each function description later mentions that the
argument represents the snapshot file name).
> Why not just say it is a snapshot
> filename, something like below?
>
> SUGGESTION:
> The pg_logicalinspect functions are called passing a snapshot filename
> to be inspected. For example, pass a name obtained from the
> pg_ls_logicalsnapdir() function.
Yeah, I like it, but...
> ~
>
> 2b. formatting
>
> nit - In the previous implementation the extraction of the LSN was
> trickier, so this part was worthy of an SGML "NOTE". Now that it is
> just a filename, I don't know if it needs to be a special note
> anymore.
In fact, giving it more thought, I think we can just remove this part.
I don't see the extra value anymore and that's something that we may need to
remove depending on what will be added to this module in the future.
I think that having the argument explanation in each function description is
enough, done that way in v12.
>
> ~~~
>
> 3.
> +postgres=# SELECT meta.* FROM pg_ls_logicalsnapdir(),
> +pg_get_logical_snapshot_meta(name) AS meta;
> +
> +-[ RECORD 1 ]--------
> +magic | 1369563137
> +checksum | 1028045905
> +version | 6
>
> 3a.
> If you are going to wrap the SQL across multiple lines like this, then
> you should show the psql continuation prompt, so that the example
> looks the same as what the user would see.
I'm not sure about this one. If the user copies and pastes the doc as it is,
then there is no psql continuation prompt. If the user does not copy and paste
the doc, then he might indeed see "something" else (but that's not surprising
since he did not copy and paste). FWIW, there are similar examples in
pgstatstatements.sgml.
> ~
>
> 3b.
> FYI, the output of that can return multiple records,
Yes, as the test in this patch does.
> which is
> b.i) probably not what you intended to demonstrate
> b.ii) not the same as what the example says
>
> e.g., I got this:
> test_pub=# SELECT meta.* FROM pg_ls_logicalsnapdir(),
> test_pub-# pg_get_logical_snapshot_meta(name) AS meta;
> -[ RECORD 1 ]--------
> magic | 1369563137
> checksum | 681884630
> version | 6
> -[ RECORD 2 ]--------
> magic | 1369563137
> checksum | 2213048308
> version | 6
> -[ RECORD 3 ]--------
> magic | 1369563137
> checksum | 3812680762
> version | 6
> -[ RECORD 4 ]--------
> magic | 1369563137
> checksum | 3759893001
> version | 6
>
I don't get the point here. The examples just show another way to use the
functions; the output is more "anecdotal" than anything else.
>
> ~~~
>
> (Also those #3a, #3b comments apply to both examples)
>
> ======
> src/backend/replication/logical/snapbuild.c
>
> 4.
> - SnapBuild builder;
> -
> - /* variable amount of TransactionIds follows */
> -} SnapBuildOnDisk;
> -
> #define SnapBuildOnDiskConstantSize \
> offsetof(SnapBuildOnDisk, builder)
> #define SnapBuildOnDiskNotChecksummedSize \
>
> Is it better to try to keep those "Size" macros defined along with
> wherever the SnapBuildOnDisk is defined? Otherwise, if the structure
> is ever changed, adjusting the macros could be easily overlooked.
I think that the less we put in snapbuild_internal.h, the better. That said,
I think you have a good point so I added a comment around the SnapBuildOnDisk
definition instead in v12.
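For illustration only (this is not part of v12): another way to make such a
mismatch harder to overlook would be a compile-time check next to the macros.
A minimal sketch, assuming hypothetical Demo* stand-ins for the real structs
and a field layout chosen purely for the example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the builder state; not the real SnapBuild. */
typedef struct DemoBuilder
{
    uint32_t xmin;
    uint32_t xmax;
} DemoBuilder;

/* Hypothetical on-disk header; the field layout is an assumption. */
typedef struct DemoOnDisk
{
    /* data not covered by checksum */
    uint32_t magic;
    uint32_t checksum;

    /* data covered by checksum */
    uint32_t version;
    uint32_t length;

    /* variable part follows the constant header */
    DemoBuilder builder;
} DemoOnDisk;

/* Size macros derived from the struct layout via offsetof(). */
#define DemoOnDiskConstantSize       offsetof(DemoOnDisk, builder)
#define DemoOnDiskNotChecksummedSize offsetof(DemoOnDisk, version)

/* Compilation fails if a field is added or moved without revisiting these. */
static_assert(DemoOnDiskNotChecksummedSize == 2 * sizeof(uint32_t),
              "unchecksummed prefix changed: update the size macros");
static_assert(DemoOnDiskConstantSize == 4 * sizeof(uint32_t),
              "constant header changed: update the size macros");

/* Small accessors so the values can be checked at run time as well. */
size_t
demo_constant_size(void)
{
    return DemoOnDiskConstantSize;
}

size_t
demo_not_checksummed_size(void)
{
    return DemoOnDiskNotChecksummedSize;
}
```

The comment added in v12 serves the same purpose with less machinery; the
assertions above are just a stricter variant of the same reminder.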
>
> ~~~
>
> 5.
> ValidateAndRestoreSnapshotFile
>
> nit - See [1] #4 suggestion to declare 'sz' at scope where used. The
> previous reason not to change this (e.g. "mainly inspired from
> SnapBuildRestore") seems less relevant because now most lines of this
> function have already been modified for other reasons.
Right. I think that's a matter of taste and I do prefer to "only" do the
necessary changes that are linked to the feature the patch is implementing.
> ~~~
>
> 6.
> SnapBuildRestore:
>
> + if (fd < 0 && errno == ENOENT)
> + return false;
> + else if (fd < 0)
> + ereport(ERROR,
> + (errcode_for_file_access(),
> + errmsg("could not open file \"%s\": %m", path)));
>
> I think this code fragment looked like this before, and you only
> relocated it,
That's right.
> but it still seems a bit awkward to write this way.
> Since so much else has changed, how about also improving this in
> passing, like below:
>
> if (fd < 0)
> {
> if (errno == ENOENT)
> return false;
>
> ereport(ERROR,
> (errcode_for_file_access(),
> errmsg("could not open file \"%s\": %m", path)));
> }
Same, I do prefer to "only" do the necessary changes that are linked to the
feature the patch is implementing (and why stop here, a similar change could be
made in logical/origin.c too for example).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From ea6e205761038924ddfe541498890d45f438aafc Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Wed, 14 Aug 2024 08:46:05 +0000
Subject: [PATCH v12] Add contrib/pg_logicalinspect
Provides SQL functions that allow inspecting logical decoding components.
It currently allows inspecting the contents of serialized logical snapshots of
a running database cluster, which is useful for debugging or educational
purposes.
---
contrib/Makefile | 1 +
contrib/meson.build | 1 +
contrib/pg_logicalinspect/.gitignore | 4 +
contrib/pg_logicalinspect/Makefile | 31 ++
.../expected/logical_inspect.out | 52 ++++
contrib/pg_logicalinspect/logicalinspect.conf | 1 +
contrib/pg_logicalinspect/meson.build | 39 +++
.../pg_logicalinspect--1.0.sql | 43 +++
contrib/pg_logicalinspect/pg_logicalinspect.c | 199 +++++++++++++
.../pg_logicalinspect.control | 5 +
.../specs/logical_inspect.spec | 34 +++
doc/src/sgml/contrib.sgml | 1 +
doc/src/sgml/filelist.sgml | 1 +
doc/src/sgml/pglogicalinspect.sgml | 142 +++++++++
src/backend/replication/logical/snapbuild.c | 279 ++++--------------
src/include/replication/snapbuild.h | 6 +-
src/include/replication/snapbuild_internal.h | 199 +++++++++++++
17 files changed, 817 insertions(+), 221 deletions(-)
7.5% contrib/pg_logicalinspect/expected/
5.3% contrib/pg_logicalinspect/specs/
28.0% contrib/pg_logicalinspect/
13.5% doc/src/sgml/
25.1% src/backend/replication/logical/
20.2% src/include/replication/
diff --git a/contrib/Makefile b/contrib/Makefile
index abd780f277..952855d9b6 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -32,6 +32,7 @@ SUBDIRS = \
passwordcheck \
pg_buffercache \
pg_freespacemap \
+ pg_logicalinspect \
pg_prewarm \
pg_stat_statements \
pg_surgery \
diff --git a/contrib/meson.build b/contrib/meson.build
index 14a8906865..159ff41555 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -46,6 +46,7 @@ subdir('passwordcheck')
subdir('pg_buffercache')
subdir('pgcrypto')
subdir('pg_freespacemap')
+subdir('pg_logicalinspect')
subdir('pg_prewarm')
subdir('pgrowlocks')
subdir('pg_stat_statements')
diff --git a/contrib/pg_logicalinspect/.gitignore b/contrib/pg_logicalinspect/.gitignore
new file mode 100644
index 0000000000..5dcb3ff972
--- /dev/null
+++ b/contrib/pg_logicalinspect/.gitignore
@@ -0,0 +1,4 @@
+# Generated subdirectories
+/log/
+/results/
+/tmp_check/
diff --git a/contrib/pg_logicalinspect/Makefile b/contrib/pg_logicalinspect/Makefile
new file mode 100644
index 0000000000..55124514d4
--- /dev/null
+++ b/contrib/pg_logicalinspect/Makefile
@@ -0,0 +1,31 @@
+# contrib/pg_logicalinspect/Makefile
+
+MODULE_big = pg_logicalinspect
+OBJS = \
+ $(WIN32RES) \
+ pg_logicalinspect.o
+PGFILEDESC = "pg_logicalinspect - functions to inspect logical decoding components"
+
+EXTENSION = pg_logicalinspect
+DATA = pg_logicalinspect--1.0.sql
+
+EXTRA_INSTALL = contrib/test_decoding
+
+ISOLATION = logical_inspect
+
+ISOLATION_OPTS = --temp-config $(top_srcdir)/contrib/pg_logicalinspect/logicalinspect.conf
+
+# Disabled because these tests require "wal_level=logical", which
+# some installcheck users do not have (e.g. buildfarm clients).
+NO_INSTALLCHECK = 1
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/pg_logicalinspect
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/pg_logicalinspect/expected/logical_inspect.out b/contrib/pg_logicalinspect/expected/logical_inspect.out
new file mode 100644
index 0000000000..d95efa4d1e
--- /dev/null
+++ b/contrib/pg_logicalinspect/expected/logical_inspect.out
@@ -0,0 +1,52 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s0_init s0_begin s0_savepoint s0_truncate s1_checkpoint s1_get_changes s0_commit s0_begin s0_insert s1_checkpoint s1_get_changes s0_commit s1_get_changes s1_get_logical_snapshot_info s1_get_logical_snapshot_meta
+step s0_init: SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding');
+?column?
+--------
+init
+(1 row)
+
+step s0_begin: BEGIN;
+step s0_savepoint: SAVEPOINT sp1;
+step s0_truncate: TRUNCATE tbl1;
+step s1_checkpoint: CHECKPOINT;
+step s1_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'skip-empty-xacts', '1', 'include-xids', '0');
+data
+----
+(0 rows)
+
+step s0_commit: COMMIT;
+step s0_begin: BEGIN;
+step s0_insert: INSERT INTO tbl1 VALUES (1);
+step s1_checkpoint: CHECKPOINT;
+step s1_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'skip-empty-xacts', '1', 'include-xids', '0');
+data
+---------------------------------------
+BEGIN
+table public.tbl1: TRUNCATE: (no-flags)
+COMMIT
+(3 rows)
+
+step s0_commit: COMMIT;
+step s1_get_changes: SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'skip-empty-xacts', '1', 'include-xids', '0');
+data
+-------------------------------------------------------------
+BEGIN
+table public.tbl1: INSERT: val1[integer]:1 val2[integer]:null
+COMMIT
+(3 rows)
+
+step s1_get_logical_snapshot_info: SELECT info.state, info.catchange_count, array_length(info.catchange_xip,1) AS catchange_array_length, info.committed_count, array_length(info.committed_xip,1) AS committed_array_length FROM pg_ls_logicalsnapdir(), pg_get_logical_snapshot_info(name) AS info ORDER BY 2;
+state |catchange_count|catchange_array_length|committed_count|committed_array_length
+----------+---------------+----------------------+---------------+----------------------
+consistent| 0| | 2| 2
+consistent| 2| 2| 0|
+(2 rows)
+
+step s1_get_logical_snapshot_meta: SELECT COUNT(meta.*) from pg_ls_logicalsnapdir(), pg_get_logical_snapshot_meta(name) as meta;
+count
+-----
+ 2
+(1 row)
+
diff --git a/contrib/pg_logicalinspect/logicalinspect.conf b/contrib/pg_logicalinspect/logicalinspect.conf
new file mode 100644
index 0000000000..e3d257315f
--- /dev/null
+++ b/contrib/pg_logicalinspect/logicalinspect.conf
@@ -0,0 +1 @@
+wal_level = logical
diff --git a/contrib/pg_logicalinspect/meson.build b/contrib/pg_logicalinspect/meson.build
new file mode 100644
index 0000000000..3ec635509b
--- /dev/null
+++ b/contrib/pg_logicalinspect/meson.build
@@ -0,0 +1,39 @@
+# Copyright (c) 2024, PostgreSQL Global Development Group
+
+pg_logicalinspect_sources = files('pg_logicalinspect.c')
+
+if host_system == 'windows'
+ pg_logicalinspect_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+ '--NAME', 'pg_logicalinspect',
+ '--FILEDESC', 'pg_logicalinspect - functions to inspect logical decoding components',])
+endif
+
+pg_logicalinspect = shared_module('pg_logicalinspect',
+ pg_logicalinspect_sources,
+ kwargs: contrib_mod_args + {
+ 'dependencies': contrib_mod_args['dependencies'],
+ },
+)
+contrib_targets += pg_logicalinspect
+
+install_data(
+ 'pg_logicalinspect.control',
+ 'pg_logicalinspect--1.0.sql',
+ kwargs: contrib_data_args,
+)
+
+tests += {
+ 'name': 'pg_logicalinspect',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'isolation': {
+ 'specs': [
+ 'logical_inspect',
+ ],
+ 'regress_args': [
+ '--temp-config', files('logicalinspect.conf'),
+ ],
+ # see above
+ 'runningcheck': false,
+ },
+}
diff --git a/contrib/pg_logicalinspect/pg_logicalinspect--1.0.sql b/contrib/pg_logicalinspect/pg_logicalinspect--1.0.sql
new file mode 100644
index 0000000000..c773f6e458
--- /dev/null
+++ b/contrib/pg_logicalinspect/pg_logicalinspect--1.0.sql
@@ -0,0 +1,43 @@
+/* contrib/pg_logicalinspect/pg_logicalinspect--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION pg_logicalinspect" to load this file. \quit
+
+--
+-- pg_get_logical_snapshot_meta()
+--
+CREATE FUNCTION pg_get_logical_snapshot_meta(IN filename text,
+ OUT magic int4,
+ OUT checksum int8,
+ OUT version int4
+)
+AS 'MODULE_PATHNAME', 'pg_get_logical_snapshot_meta'
+LANGUAGE C STRICT PARALLEL SAFE;
+
+REVOKE EXECUTE ON FUNCTION pg_get_logical_snapshot_meta(text) FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_get_logical_snapshot_meta(text) TO pg_read_server_files;
+
+--
+-- pg_get_logical_snapshot_info()
+--
+CREATE FUNCTION pg_get_logical_snapshot_info(IN filename text,
+ OUT state text,
+ OUT xmin xid,
+ OUT xmax xid,
+ OUT start_decoding_at pg_lsn,
+ OUT two_phase_at pg_lsn,
+ OUT initial_xmin_horizon xid,
+ OUT building_full_snapshot boolean,
+ OUT in_slot_creation boolean,
+ OUT last_serialized_snapshot pg_lsn,
+ OUT next_phase_at xid,
+ OUT committed_count int8,
+ OUT committed_xip xid[],
+ OUT catchange_count int8,
+ OUT catchange_xip xid[]
+)
+AS 'MODULE_PATHNAME', 'pg_get_logical_snapshot_info'
+LANGUAGE C STRICT PARALLEL SAFE;
+
+REVOKE EXECUTE ON FUNCTION pg_get_logical_snapshot_info(text) FROM PUBLIC;
+GRANT EXECUTE ON FUNCTION pg_get_logical_snapshot_info(text) TO pg_read_server_files;
diff --git a/contrib/pg_logicalinspect/pg_logicalinspect.c b/contrib/pg_logicalinspect/pg_logicalinspect.c
new file mode 100644
index 0000000000..0e3e1f50fc
--- /dev/null
+++ b/contrib/pg_logicalinspect/pg_logicalinspect.c
@@ -0,0 +1,199 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_logicalinspect.c
+ * Functions to inspect contents of PostgreSQL logical snapshots
+ *
+ * Copyright (c) 2024, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ * contrib/pg_logicalinspect/pg_logicalinspect.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "funcapi.h"
+#include "replication/snapbuild_internal.h"
+#include "utils/array.h"
+#include "utils/builtins.h"
+#include "utils/pg_lsn.h"
+
+PG_MODULE_MAGIC;
+
+PG_FUNCTION_INFO_V1(pg_get_logical_snapshot_meta);
+PG_FUNCTION_INFO_V1(pg_get_logical_snapshot_info);
+
+/* Return the description of SnapBuildState */
+static const char *
+get_snapbuild_state_desc(SnapBuildState state)
+{
+ const char *stateDesc = "unknown state";
+
+ switch (state)
+ {
+ case SNAPBUILD_START:
+ stateDesc = "start";
+ break;
+ case SNAPBUILD_BUILDING_SNAPSHOT:
+ stateDesc = "building";
+ break;
+ case SNAPBUILD_FULL_SNAPSHOT:
+ stateDesc = "full";
+ break;
+ case SNAPBUILD_CONSISTENT:
+ stateDesc = "consistent";
+ break;
+ }
+
+ return stateDesc;
+}
+
+/*
+ * Retrieve the logical snapshot file metadata.
+ */
+Datum
+pg_get_logical_snapshot_meta(PG_FUNCTION_ARGS)
+{
+#define PG_GET_LOGICAL_SNAPSHOT_META_COLS 3
+ SnapBuildOnDisk ondisk;
+ HeapTuple tuple;
+ Datum values[PG_GET_LOGICAL_SNAPSHOT_META_COLS];
+ bool nulls[PG_GET_LOGICAL_SNAPSHOT_META_COLS];
+ TupleDesc tupdesc;
+ char path[MAXPGPATH];
+ MemoryContext context;
+ int fd;
+ int i = 0;
+ text *filename_t = PG_GETARG_TEXT_PP(0);
+
+ snprintf(path, sizeof(path), "%s/%s",
+ PG_LOGICAL_SNAPSHOTS_DIR,
+ text_to_cstring(filename_t));
+
+ fd = OpenTransientFile(path, O_RDONLY | PG_BINARY);
+
+ if (fd < 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open file \"%s\": %m", path)));
+
+ context = AllocSetContextCreate(CurrentMemoryContext,
+ "logicalsnapshot inspect context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /* Validate and restore the snapshot to 'ondisk' */
+ ValidateAndRestoreSnapshotFile(&ondisk, path, fd, context);
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ memset(nulls, 0, sizeof(nulls));
+
+ values[i++] = UInt32GetDatum(ondisk.magic);
+ values[i++] = Int64GetDatum((int64) ondisk.checksum);
+ values[i++] = UInt32GetDatum(ondisk.version);
+
+ Assert(i == PG_GET_LOGICAL_SNAPSHOT_META_COLS);
+
+ tuple = heap_form_tuple(tupdesc, values, nulls);
+
+ MemoryContextReset(context);
+
+ PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
+
+#undef PG_GET_LOGICAL_SNAPSHOT_META_COLS
+}
+
+Datum
+pg_get_logical_snapshot_info(PG_FUNCTION_ARGS)
+{
+#define PG_GET_LOGICAL_SNAPSHOT_INFO_COLS 14
+ SnapBuildOnDisk ondisk;
+ HeapTuple tuple;
+ Datum values[PG_GET_LOGICAL_SNAPSHOT_INFO_COLS];
+ bool nulls[PG_GET_LOGICAL_SNAPSHOT_INFO_COLS];
+ TupleDesc tupdesc;
+ char path[MAXPGPATH];
+ MemoryContext context;
+ int fd;
+ int i = 0;
+ text *filename_t = PG_GETARG_TEXT_PP(0);
+
+ snprintf(path, sizeof(path), "%s/%s",
+ PG_LOGICAL_SNAPSHOTS_DIR,
+ text_to_cstring(filename_t));
+
+ fd = OpenTransientFile(path, O_RDONLY | PG_BINARY);
+
+ if (fd < 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open file \"%s\": %m", path)));
+
+ context = AllocSetContextCreate(CurrentMemoryContext,
+ "logicalsnapshot inspect context",
+ ALLOCSET_DEFAULT_SIZES);
+
+ /* Validate and restore the snapshot to 'ondisk' */
+ ValidateAndRestoreSnapshotFile(&ondisk, path, fd, context);
+
+ /* Build a tuple descriptor for our result type */
+ if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
+ elog(ERROR, "return type must be a row type");
+
+ memset(nulls, 0, sizeof(nulls));
+
+ values[i++] = CStringGetTextDatum(get_snapbuild_state_desc(ondisk.builder.state));
+ values[i++] = TransactionIdGetDatum(ondisk.builder.xmin);
+ values[i++] = TransactionIdGetDatum(ondisk.builder.xmax);
+ values[i++] = LSNGetDatum(ondisk.builder.start_decoding_at);
+ values[i++] = LSNGetDatum(ondisk.builder.two_phase_at);
+ values[i++] = TransactionIdGetDatum(ondisk.builder.initial_xmin_horizon);
+ values[i++] = BoolGetDatum(ondisk.builder.building_full_snapshot);
+ values[i++] = BoolGetDatum(ondisk.builder.in_slot_creation);
+ values[i++] = LSNGetDatum(ondisk.builder.last_serialized_snapshot);
+ values[i++] = TransactionIdGetDatum(ondisk.builder.next_phase_at);
+
+ values[i++] = Int64GetDatum(ondisk.builder.committed.xcnt);
+ if (ondisk.builder.committed.xcnt > 0)
+ {
+ Datum *arrayelems;
+ int narrayelems = 0;
+
+ arrayelems = (Datum *) palloc(ondisk.builder.committed.xcnt * sizeof(Datum));
+
+ for (; narrayelems < ondisk.builder.committed.xcnt; narrayelems++)
+ arrayelems[narrayelems] = Int64GetDatum((int64) ondisk.builder.committed.xip[narrayelems]);
+
+ values[i++] = PointerGetDatum(construct_array_builtin(arrayelems, narrayelems, INT8OID));
+ }
+ else
+ nulls[i++] = true;
+
+ values[i++] = Int64GetDatum(ondisk.builder.catchange.xcnt);
+ if (ondisk.builder.catchange.xcnt > 0)
+ {
+ Datum *arrayelems;
+ int narrayelems = 0;
+
+ arrayelems = (Datum *) palloc(ondisk.builder.catchange.xcnt * sizeof(Datum));
+
+ for (; narrayelems < ondisk.builder.catchange.xcnt; narrayelems++)
+ arrayelems[narrayelems] = Int64GetDatum((int64) ondisk.builder.catchange.xip[narrayelems]);
+
+ values[i++] = PointerGetDatum(construct_array_builtin(arrayelems, narrayelems, INT8OID));
+ }
+ else
+ nulls[i++] = true;
+
+ Assert(i == PG_GET_LOGICAL_SNAPSHOT_INFO_COLS);
+
+ tuple = heap_form_tuple(tupdesc, values, nulls);
+
+ MemoryContextReset(context);
+
+ PG_RETURN_DATUM(HeapTupleGetDatum(tuple));
+
+#undef PG_GET_LOGICAL_SNAPSHOT_INFO_COLS
+}
diff --git a/contrib/pg_logicalinspect/pg_logicalinspect.control b/contrib/pg_logicalinspect/pg_logicalinspect.control
new file mode 100644
index 0000000000..b4a70e57ba
--- /dev/null
+++ b/contrib/pg_logicalinspect/pg_logicalinspect.control
@@ -0,0 +1,5 @@
+# pg_logicalinspect extension
+comment = 'functions to inspect logical decoding components'
+default_version = '1.0'
+module_pathname = '$libdir/pg_logicalinspect'
+relocatable = true
diff --git a/contrib/pg_logicalinspect/specs/logical_inspect.spec b/contrib/pg_logicalinspect/specs/logical_inspect.spec
new file mode 100644
index 0000000000..9851a6c18e
--- /dev/null
+++ b/contrib/pg_logicalinspect/specs/logical_inspect.spec
@@ -0,0 +1,34 @@
+# Test the pg_logicalinspect functions. The permutation below ensures that
+# multiple logical snapshots are created and that one of them contains
+# ongoing catalog changes.
+setup
+{
+ DROP TABLE IF EXISTS tbl1;
+ CREATE TABLE tbl1 (val1 integer, val2 integer);
+ CREATE EXTENSION pg_logicalinspect;
+}
+
+teardown
+{
+ DROP TABLE tbl1;
+ SELECT 'stop' FROM pg_drop_replication_slot('isolation_slot');
+ DROP EXTENSION pg_logicalinspect;
+}
+
+session "s0"
+setup { SET synchronous_commit=on; }
+step "s0_init" { SELECT 'init' FROM pg_create_logical_replication_slot('isolation_slot', 'test_decoding'); }
+step "s0_begin" { BEGIN; }
+step "s0_savepoint" { SAVEPOINT sp1; }
+step "s0_truncate" { TRUNCATE tbl1; }
+step "s0_insert" { INSERT INTO tbl1 VALUES (1); }
+step "s0_commit" { COMMIT; }
+
+session "s1"
+setup { SET synchronous_commit=on; }
+step "s1_checkpoint" { CHECKPOINT; }
+step "s1_get_changes" { SELECT data FROM pg_logical_slot_get_changes('isolation_slot', NULL, NULL, 'skip-empty-xacts', '1', 'include-xids', '0'); }
+step "s1_get_logical_snapshot_meta" { SELECT COUNT(meta.*) from pg_ls_logicalsnapdir(), pg_get_logical_snapshot_meta(name) as meta;}
+step "s1_get_logical_snapshot_info" { SELECT info.state, info.catchange_count, array_length(info.catchange_xip,1) AS catchange_array_length, info.committed_count, array_length(info.committed_xip,1) AS committed_array_length FROM pg_ls_logicalsnapdir(), pg_get_logical_snapshot_info(name) AS info ORDER BY 2; }
+
+permutation "s0_init" "s0_begin" "s0_savepoint" "s0_truncate" "s1_checkpoint" "s1_get_changes" "s0_commit" "s0_begin" "s0_insert" "s1_checkpoint" "s1_get_changes" "s0_commit" "s1_get_changes" "s1_get_logical_snapshot_info" "s1_get_logical_snapshot_meta"
diff --git a/doc/src/sgml/contrib.sgml b/doc/src/sgml/contrib.sgml
index 44639a8dca..7c381949a5 100644
--- a/doc/src/sgml/contrib.sgml
+++ b/doc/src/sgml/contrib.sgml
@@ -154,6 +154,7 @@ CREATE EXTENSION <replaceable>extension_name</replaceable>;
&pgbuffercache;
&pgcrypto;
&pgfreespacemap;
+ &pglogicalinspect;
&pgprewarm;
&pgrowlocks;
&pgstatstatements;
diff --git a/doc/src/sgml/filelist.sgml b/doc/src/sgml/filelist.sgml
index a7ff5f8264..66e6dccd4c 100644
--- a/doc/src/sgml/filelist.sgml
+++ b/doc/src/sgml/filelist.sgml
@@ -143,6 +143,7 @@
<!ENTITY pgbuffercache SYSTEM "pgbuffercache.sgml">
<!ENTITY pgcrypto SYSTEM "pgcrypto.sgml">
<!ENTITY pgfreespacemap SYSTEM "pgfreespacemap.sgml">
+<!ENTITY pglogicalinspect SYSTEM "pglogicalinspect.sgml">
<!ENTITY pgprewarm SYSTEM "pgprewarm.sgml">
<!ENTITY pgrowlocks SYSTEM "pgrowlocks.sgml">
<!ENTITY pgstatstatements SYSTEM "pgstatstatements.sgml">
diff --git a/doc/src/sgml/pglogicalinspect.sgml b/doc/src/sgml/pglogicalinspect.sgml
new file mode 100644
index 0000000000..e984979462
--- /dev/null
+++ b/doc/src/sgml/pglogicalinspect.sgml
@@ -0,0 +1,142 @@
+<!-- doc/src/sgml/pglogicalinspect.sgml -->
+
+<sect1 id="pglogicalinspect" xreflabel="pg_logicalinspect">
+ <title>pg_logicalinspect — logical decoding components inspection</title>
+
+ <indexterm zone="pglogicalinspect">
+ <primary>pg_logicalinspect</primary>
+ </indexterm>
+
+ <para>
+ The <filename>pg_logicalinspect</filename> module provides SQL functions
+ that allow you to inspect the contents of logical decoding components. It
+ allows the inspection of serialized logical snapshots of a running
+ <productname>PostgreSQL</productname> database cluster, which is useful
+ for debugging or educational purposes.
+ </para>
+
+ <para>
+ By default, use of these functions is restricted to superusers and members of
+ the <literal>pg_read_server_files</literal> role. Access may be granted by
+ superusers to others using <command>GRANT</command>.
+ </para>
+
+ <sect2 id="pglogicalinspect-funcs">
+ <title>General Functions</title>
+
+ <variablelist>
+ <varlistentry id="pglogicalinspect-funcs-pg-get-logical-snapshot-meta">
+ <term>
+ <function>pg_get_logical_snapshot_meta(filename text) returns record</function>
+ </term>
+
+ <listitem>
+ <para>
+ Gets logical snapshot metadata about a snapshot file that is located in
+ the server's <filename>pg_logical/snapshots</filename> directory.
+ The <replaceable>filename</replaceable> argument represents the snapshot
+ file name.
+ For example:
+<screen>
+postgres=# SELECT * FROM pg_ls_logicalsnapdir();
+-[ RECORD 1 ]+-----------------------
+name | 0-40796E18.snap
+size | 152
+modification | 2024-08-14 16:36:32+00
+
+postgres=# SELECT * FROM pg_get_logical_snapshot_meta('0-40796E18.snap');
+-[ RECORD 1 ]--------
+magic | 1369563137
+checksum | 1028045905
+version | 6
+
+postgres=# SELECT meta.* FROM pg_ls_logicalsnapdir(),
+pg_get_logical_snapshot_meta(name) AS meta;
+
+-[ RECORD 1 ]--------
+magic | 1369563137
+checksum | 1028045905
+version | 6
+</screen>
+ </para>
+ <para>
+ If <replaceable>filename</replaceable> does not match a snapshot file, the
+ function raises an error.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry id="pglogicalinspect-funcs-pg-get-logical-snapshot-info">
+ <term>
+ <function>pg_get_logical_snapshot_info(filename text) returns record</function>
+ </term>
+
+ <listitem>
+ <para>
+ Gets logical snapshot information about a snapshot file that is located in
+ the server's <filename>pg_logical/snapshots</filename> directory.
+ The <replaceable>filename</replaceable> argument represents the snapshot
+ file name.
+ For example:
+<screen>
+postgres=# SELECT * FROM pg_ls_logicalsnapdir();
+-[ RECORD 1 ]+-----------------------
+name | 0-40796E18.snap
+size | 152
+modification | 2024-08-14 16:36:32+00
+
+postgres=# SELECT * FROM pg_get_logical_snapshot_info('0-40796E18.snap');
+-[ RECORD 1 ]------------+-----------
+state | consistent
+xmin | 751
+xmax | 751
+start_decoding_at | 0/40796AF8
+two_phase_at | 0/40796AF8
+initial_xmin_horizon | 0
+building_full_snapshot | f
+in_slot_creation | f
+last_serialized_snapshot | 0/0
+next_phase_at | 0
+committed_count | 0
+committed_xip |
+catchange_count | 2
+catchange_xip | {751,752}
+
+postgres=# SELECT info.* FROM pg_ls_logicalsnapdir(),
+pg_get_logical_snapshot_info(name) AS info;
+-[ RECORD 1 ]------------+-----------
+state | consistent
+xmin | 751
+xmax | 751
+start_decoding_at | 0/40796AF8
+two_phase_at | 0/40796AF8
+initial_xmin_horizon | 0
+building_full_snapshot | f
+in_slot_creation | f
+last_serialized_snapshot | 0/0
+next_phase_at | 0
+committed_count | 0
+committed_xip |
+catchange_count | 2
+catchange_xip | {751,752}
+</screen>
+ </para>
+ <para>
+ If <replaceable>filename</replaceable> does not match a snapshot file, the
+ function raises an error.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+ </sect2>
+
+ <sect2 id="pglogicalinspect-author">
+ <title>Author</title>
+
+ <para>
+ Bertrand Drouvot <email>[email protected]</email>
+ </para>
+ </sect2>
+
+</sect1>
diff --git a/src/backend/replication/logical/snapbuild.c b/src/backend/replication/logical/snapbuild.c
index 0450f94ba8..7a3b963a2f 100644
--- a/src/backend/replication/logical/snapbuild.c
+++ b/src/backend/replication/logical/snapbuild.c
@@ -134,6 +134,7 @@
#include "replication/logical.h"
#include "replication/reorderbuffer.h"
#include "replication/snapbuild.h"
+#include "replication/snapbuild_internal.h"
#include "storage/fd.h"
#include "storage/lmgr.h"
#include "storage/proc.h"
@@ -143,146 +144,6 @@
#include "utils/memutils.h"
#include "utils/snapmgr.h"
#include "utils/snapshot.h"
-
-/*
- * This struct contains the current state of the snapshot building
- * machinery. Besides a forward declaration in the header, it is not exposed
- * to the public, so we can easily change its contents.
- */
-struct SnapBuild
-{
- /* how far are we along building our first full snapshot */
- SnapBuildState state;
-
- /* private memory context used to allocate memory for this module. */
- MemoryContext context;
-
- /* all transactions < than this have committed/aborted */
- TransactionId xmin;
-
- /* all transactions >= than this are uncommitted */
- TransactionId xmax;
-
- /*
- * Don't replay commits from an LSN < this LSN. This can be set externally
- * but it will also be advanced (never retreat) from within snapbuild.c.
- */
- XLogRecPtr start_decoding_at;
-
- /*
- * LSN at which two-phase decoding was enabled or LSN at which we found a
- * consistent point at the time of slot creation.
- *
- * The prepared transactions, that were skipped because previously
- * two-phase was not enabled or are not covered by initial snapshot, need
- * to be sent later along with commit prepared and they must be before
- * this point.
- */
- XLogRecPtr two_phase_at;
-
- /*
- * Don't start decoding WAL until the "xl_running_xacts" information
- * indicates there are no running xids with an xid smaller than this.
- */
- TransactionId initial_xmin_horizon;
-
- /* Indicates if we are building full snapshot or just catalog one. */
- bool building_full_snapshot;
-
- /*
- * Indicates if we are using the snapshot builder for the creation of a
- * logical replication slot. If it's true, the start point for decoding
- * changes is not determined yet. So we skip snapshot restores to properly
- * find the start point. See SnapBuildFindSnapshot() for details.
- */
- bool in_slot_creation;
-
- /*
- * Snapshot that's valid to see the catalog state seen at this moment.
- */
- Snapshot snapshot;
-
- /*
- * LSN of the last location we are sure a snapshot has been serialized to.
- */
- XLogRecPtr last_serialized_snapshot;
-
- /*
- * The reorderbuffer we need to update with usable snapshots et al.
- */
- ReorderBuffer *reorder;
-
- /*
- * TransactionId at which the next phase of initial snapshot building will
- * happen. InvalidTransactionId if not known (i.e. SNAPBUILD_START), or
- * when no next phase necessary (SNAPBUILD_CONSISTENT).
- */
- TransactionId next_phase_at;
-
- /*
- * Array of transactions which could have catalog changes that committed
- * between xmin and xmax.
- */
- struct
- {
- /* number of committed transactions */
- size_t xcnt;
-
- /* available space for committed transactions */
- size_t xcnt_space;
-
- /*
- * Until we reach a CONSISTENT state, we record commits of all
- * transactions, not just the catalog changing ones. Record when that
- * changes so we know we cannot export a snapshot safely anymore.
- */
- bool includes_all_transactions;
-
- /*
- * Array of committed transactions that have modified the catalog.
- *
- * As this array is frequently modified we do *not* keep it in
- * xidComparator order. Instead we sort the array when building &
- * distributing a snapshot.
- *
- * TODO: It's unclear whether that reasoning has much merit. Every
- * time we add something here after becoming consistent will also
- * require distributing a snapshot. Storing them sorted would
- * potentially also make it easier to purge (but more complicated wrt
- * wraparound?). Should be improved if sorting while building the
- * snapshot shows up in profiles.
- */
- TransactionId *xip;
- } committed;
-
- /*
- * Array of transactions and subtransactions that had modified catalogs
- * and were running when the snapshot was serialized.
- *
- * We normally rely on some WAL record types such as HEAP2_NEW_CID to know
- * if the transaction has changed the catalog. But it could happen that
- * the logical decoding decodes only the commit record of the transaction
- * after restoring the previously serialized snapshot in which case we
- * will miss adding the xid to the snapshot and end up looking at the
- * catalogs with the wrong snapshot.
- *
- * Now to avoid the above problem, we serialize the transactions that had
- * modified the catalogs and are still running at the time of snapshot
- * serialization. We fill this array while restoring the snapshot and then
- * refer it while decoding commit to ensure if the xact has modified the
- * catalog. We discard this array when all the xids in the list become old
- * enough to matter. See SnapBuildPurgeOlderTxn for details.
- */
- struct
- {
- /* number of transactions */
- size_t xcnt;
-
- /* This array must be sorted in xidComparator order */
- TransactionId *xip;
- } catchange;
-};
-
/*
* Starting a transaction -- which we need to do while exporting a snapshot --
* removes knowledge about the previously used resowner, so we save it here.
@@ -1557,40 +1418,6 @@ SnapBuildWaitSnapshot(xl_running_xacts *running, TransactionId cutoff)
}
}
-/* -----------------------------------
- * Snapshot serialization support
- * -----------------------------------
- */
-
-/*
- * We store current state of struct SnapBuild on disk in the following manner:
- *
- * struct SnapBuildOnDisk;
- * TransactionId * committed.xcnt; (*not xcnt_space*)
- * TransactionId * catchange.xcnt;
- *
- */
-typedef struct SnapBuildOnDisk
-{
- /* first part of this struct needs to be version independent */
-
- /* data not covered by checksum */
- uint32 magic;
- pg_crc32c checksum;
-
- /* data covered by checksum */
-
- /* version, in case we want to support pg_upgrade */
- uint32 version;
- /* how large is the on disk data, excluding the constant sized part */
- uint32 length;
-
- /* version dependent part */
- SnapBuild builder;
-
- /* variable amount of TransactionIds follows */
-} SnapBuildOnDisk;
-
#define SnapBuildOnDiskConstantSize \
offsetof(SnapBuildOnDisk, builder)
#define SnapBuildOnDiskNotChecksummedSize \
@@ -1857,34 +1684,14 @@ out:
}
/*
- * Restore a snapshot into 'builder' if previously one has been stored at the
- * location indicated by 'lsn'. Returns true if successful, false otherwise.
+ * Validate the logical snapshot file and read its contents into 'ondisk'.
*/
-static bool
-SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
+void
+ValidateAndRestoreSnapshotFile(SnapBuildOnDisk *ondisk, const char *path, int fd,
+ MemoryContext context)
{
- SnapBuildOnDisk ondisk;
- int fd;
- char path[MAXPGPATH];
- Size sz;
pg_crc32c checksum;
-
- /* no point in loading a snapshot if we're already there */
- if (builder->state == SNAPBUILD_CONSISTENT)
- return false;
-
- sprintf(path, "%s/%X-%X.snap",
- PG_LOGICAL_SNAPSHOTS_DIR,
- LSN_FORMAT_ARGS(lsn));
-
- fd = OpenTransientFile(path, O_RDONLY | PG_BINARY);
-
- if (fd < 0 && errno == ENOENT)
- return false;
- else if (fd < 0)
- ereport(ERROR,
- (errcode_for_file_access(),
- errmsg("could not open file \"%s\": %m", path)));
+ Size sz;
/* ----
* Make sure the snapshot had been stored safely to disk, that's normally
@@ -1897,47 +1704,46 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
fsync_fname(path, false);
fsync_fname(PG_LOGICAL_SNAPSHOTS_DIR, true);
-
/* read statically sized portion of snapshot */
- SnapBuildRestoreContents(fd, (char *) &ondisk, SnapBuildOnDiskConstantSize, path);
+ SnapBuildRestoreContents(fd, (char *) ondisk, SnapBuildOnDiskConstantSize, path);
- if (ondisk.magic != SNAPBUILD_MAGIC)
+ if (ondisk->magic != SNAPBUILD_MAGIC)
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("snapbuild state file \"%s\" has wrong magic number: %u instead of %u",
- path, ondisk.magic, SNAPBUILD_MAGIC)));
+ path, ondisk->magic, SNAPBUILD_MAGIC)));
- if (ondisk.version != SNAPBUILD_VERSION)
+ if (ondisk->version != SNAPBUILD_VERSION)
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("snapbuild state file \"%s\" has unsupported version: %u instead of %u",
- path, ondisk.version, SNAPBUILD_VERSION)));
+ path, ondisk->version, SNAPBUILD_VERSION)));
INIT_CRC32C(checksum);
COMP_CRC32C(checksum,
- ((char *) &ondisk) + SnapBuildOnDiskNotChecksummedSize,
+ ((char *) ondisk) + SnapBuildOnDiskNotChecksummedSize,
SnapBuildOnDiskConstantSize - SnapBuildOnDiskNotChecksummedSize);
/* read SnapBuild */
- SnapBuildRestoreContents(fd, (char *) &ondisk.builder, sizeof(SnapBuild), path);
- COMP_CRC32C(checksum, &ondisk.builder, sizeof(SnapBuild));
+ SnapBuildRestoreContents(fd, (char *) &ondisk->builder, sizeof(SnapBuild), path);
+ COMP_CRC32C(checksum, &ondisk->builder, sizeof(SnapBuild));
/* restore committed xacts information */
- if (ondisk.builder.committed.xcnt > 0)
+ if (ondisk->builder.committed.xcnt > 0)
{
- sz = sizeof(TransactionId) * ondisk.builder.committed.xcnt;
- ondisk.builder.committed.xip = MemoryContextAllocZero(builder->context, sz);
- SnapBuildRestoreContents(fd, (char *) ondisk.builder.committed.xip, sz, path);
- COMP_CRC32C(checksum, ondisk.builder.committed.xip, sz);
+ sz = sizeof(TransactionId) * ondisk->builder.committed.xcnt;
+ ondisk->builder.committed.xip = MemoryContextAllocZero(context, sz);
+ SnapBuildRestoreContents(fd, (char *) ondisk->builder.committed.xip, sz, path);
+ COMP_CRC32C(checksum, ondisk->builder.committed.xip, sz);
}
/* restore catalog modifying xacts information */
- if (ondisk.builder.catchange.xcnt > 0)
+ if (ondisk->builder.catchange.xcnt > 0)
{
- sz = sizeof(TransactionId) * ondisk.builder.catchange.xcnt;
- ondisk.builder.catchange.xip = MemoryContextAllocZero(builder->context, sz);
- SnapBuildRestoreContents(fd, (char *) ondisk.builder.catchange.xip, sz, path);
- COMP_CRC32C(checksum, ondisk.builder.catchange.xip, sz);
+ sz = sizeof(TransactionId) * ondisk->builder.catchange.xcnt;
+ ondisk->builder.catchange.xip = MemoryContextAllocZero(context, sz);
+ SnapBuildRestoreContents(fd, (char *) ondisk->builder.catchange.xip, sz, path);
+ COMP_CRC32C(checksum, ondisk->builder.catchange.xip, sz);
}
if (CloseTransientFile(fd) != 0)
@@ -1948,11 +1754,44 @@ SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
FIN_CRC32C(checksum);
/* verify checksum of what we've read */
- if (!EQ_CRC32C(checksum, ondisk.checksum))
+ if (!EQ_CRC32C(checksum, ondisk->checksum))
ereport(ERROR,
(errcode(ERRCODE_DATA_CORRUPTED),
errmsg("checksum mismatch for snapbuild state file \"%s\": is %u, should be %u",
- path, checksum, ondisk.checksum)));
+ path, checksum, ondisk->checksum)));
+}
+
+/*
+ * Restore a snapshot into 'builder' if previously one has been stored at the
+ * location indicated by 'lsn'. Returns true if successful, false otherwise.
+ */
+static bool
+SnapBuildRestore(SnapBuild *builder, XLogRecPtr lsn)
+{
+ SnapBuildOnDisk ondisk;
+ int fd;
+ char path[MAXPGPATH];
+
+ /* no point in loading a snapshot if we're already there */
+ if (builder->state == SNAPBUILD_CONSISTENT)
+ return false;
+
+ sprintf(path, "%s/%X-%X.snap",
+ PG_LOGICAL_SNAPSHOTS_DIR,
+ LSN_FORMAT_ARGS(lsn));
+
+ fd = OpenTransientFile(path, O_RDONLY | PG_BINARY);
+
+ if (fd < 0 && errno == ENOENT)
+ return false;
+ else if (fd < 0)
+ ereport(ERROR,
+ (errcode_for_file_access(),
+ errmsg("could not open file \"%s\": %m", path)));
+
+
+ /* validate and restore the snapshot to 'ondisk' */
+ ValidateAndRestoreSnapshotFile(&ondisk, path, fd, builder->context);
/*
* ok, we now have a sensible snapshot here, figure out if it has more
diff --git a/src/include/replication/snapbuild.h b/src/include/replication/snapbuild.h
index caa5113ff8..3c1454df99 100644
--- a/src/include/replication/snapbuild.h
+++ b/src/include/replication/snapbuild.h
@@ -15,6 +15,10 @@
#include "access/xlogdefs.h"
#include "utils/snapmgr.h"
+/*
+ * Please keep get_snapbuild_state_desc() (located in the pg_logicalinspect
+ * module) updated if a change needs to be made to SnapBuildState.
+ */
typedef enum
{
/*
@@ -46,7 +50,7 @@ typedef enum
SNAPBUILD_CONSISTENT = 2,
} SnapBuildState;
-/* forward declare so we don't have to expose the struct to the public */
+/* forward declare so we don't have to include snapbuild_internal.h */
struct SnapBuild;
typedef struct SnapBuild SnapBuild;
diff --git a/src/include/replication/snapbuild_internal.h b/src/include/replication/snapbuild_internal.h
new file mode 100644
index 0000000000..4791a90991
--- /dev/null
+++ b/src/include/replication/snapbuild_internal.h
@@ -0,0 +1,199 @@
+/*-------------------------------------------------------------------------
+ *
+ * snapbuild_internal.h
+ * This file contains declarations for logical decoding utility
+ * functions for internal use.
+ *
+ * Copyright (c) 2024, PostgreSQL Global Development Group
+ *
+ * src/include/replication/snapbuild_internal.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef SNAPBUILD_INTERNAL_H
+#define SNAPBUILD_INTERNAL_H
+
+#include "port/pg_crc32c.h"
+#include "replication/reorderbuffer.h"
+#include "replication/snapbuild.h"
+
+/*
+ * This struct contains the current state of the snapshot building
+ * machinery. It is exposed to the public, so pay attention when changing its
+ * contents.
+ */
+typedef struct SnapBuild
+{
+ /* how far are we along building our first full snapshot */
+ SnapBuildState state;
+
+ /* private memory context used to allocate memory for this module. */
+ MemoryContext context;
+
+ /* all transactions < this have committed/aborted */
+ TransactionId xmin;
+
+ /* all transactions >= this are uncommitted */
+ TransactionId xmax;
+
+ /*
+ * Don't replay commits from an LSN < this LSN. This can be set externally
+ * but it will also be advanced (never retreat) from within snapbuild.c.
+ */
+ XLogRecPtr start_decoding_at;
+
+ /*
+ * LSN at which two-phase decoding was enabled or LSN at which we found a
+ * consistent point at the time of slot creation.
+ *
+ * The prepared transactions, that were skipped because previously
+ * two-phase was not enabled or are not covered by initial snapshot, need
+ * to be sent later along with commit prepared and they must be before
+ * this point.
+ */
+ XLogRecPtr two_phase_at;
+
+ /*
+ * Don't start decoding WAL until the "xl_running_xacts" information
+ * indicates there are no running xids with an xid smaller than this.
+ */
+ TransactionId initial_xmin_horizon;
+
+ /* Indicates if we are building full snapshot or just catalog one. */
+ bool building_full_snapshot;
+
+ /*
+ * Indicates if we are using the snapshot builder for the creation of a
+ * logical replication slot. If it's true, the start point for decoding
+ * changes is not determined yet. So we skip snapshot restores to properly
+ * find the start point. See SnapBuildFindSnapshot() for details.
+ */
+ bool in_slot_creation;
+
+ /*
+ * Snapshot that's valid to see the catalog state seen at this moment.
+ */
+ Snapshot snapshot;
+
+ /*
+ * LSN of the last location we are sure a snapshot has been serialized to.
+ */
+ XLogRecPtr last_serialized_snapshot;
+
+ /*
+ * The reorderbuffer we need to update with usable snapshots et al.
+ */
+ ReorderBuffer *reorder;
+
+ /*
+ * TransactionId at which the next phase of initial snapshot building will
+ * happen. InvalidTransactionId if not known (i.e. SNAPBUILD_START), or
+ * when no next phase necessary (SNAPBUILD_CONSISTENT).
+ */
+ TransactionId next_phase_at;
+
+ /*
+ * Array of transactions which could have catalog changes that committed
+ * between xmin and xmax.
+ */
+ struct
+ {
+ /* number of committed transactions */
+ size_t xcnt;
+
+ /* available space for committed transactions */
+ size_t xcnt_space;
+
+ /*
+ * Until we reach a CONSISTENT state, we record commits of all
+ * transactions, not just the catalog changing ones. Record when that
+ * changes so we know we cannot export a snapshot safely anymore.
+ */
+ bool includes_all_transactions;
+
+ /*
+ * Array of committed transactions that have modified the catalog.
+ *
+ * As this array is frequently modified we do *not* keep it in
+ * xidComparator order. Instead we sort the array when building &
+ * distributing a snapshot.
+ *
+ * TODO: It's unclear whether that reasoning has much merit. Every
+ * time we add something here after becoming consistent will also
+ * require distributing a snapshot. Storing them sorted would
+ * potentially also make it easier to purge (but more complicated wrt
+ * wraparound?). Should be improved if sorting while building the
+ * snapshot shows up in profiles.
+ */
+ TransactionId *xip;
+ } committed;
+
+ /*
+ * Array of transactions and subtransactions that had modified catalogs
+ * and were running when the snapshot was serialized.
+ *
+ * We normally rely on some WAL record types such as HEAP2_NEW_CID to know
+ * if the transaction has changed the catalog. But it could happen that
+ * the logical decoding decodes only the commit record of the transaction
+ * after restoring the previously serialized snapshot in which case we
+ * will miss adding the xid to the snapshot and end up looking at the
+ * catalogs with the wrong snapshot.
+ *
+ * To avoid the above problem, we serialize the transactions that had
+ * modified the catalogs and are still running at the time of snapshot
+ * serialization. We fill this array while restoring the snapshot and then
+ * refer to it while decoding commit to check whether the xact has modified
+ * the catalog. We discard this array when all the xids in the list become
+ * too old to matter. See SnapBuildPurgeOlderTxn for details.
+ */
+ struct
+ {
+ /* number of transactions */
+ size_t xcnt;
+
+ /* This array must be sorted in xidComparator order */
+ TransactionId *xip;
+ } catchange;
+} SnapBuild;
+
+/* -----------------------------------
+ * Snapshot serialization support
+ * -----------------------------------
+ */
+
+/*
+ * We store current state of struct SnapBuild on disk in the following manner:
+ *
+ * struct SnapBuildOnDisk;
+ * TransactionId * committed.xcnt; (*not xcnt_space*)
+ * TransactionId * catchange.xcnt;
+ *
+ * Check if the SnapBuildOnDiskConstantSize and SnapBuildOnDiskNotChecksummedSize
+ * macros need to be updated when modifying the SnapBuildOnDisk struct.
+ */
+typedef struct SnapBuildOnDisk
+{
+ /* first part of this struct needs to be version independent */
+
+ /* data not covered by checksum */
+ uint32 magic;
+ pg_crc32c checksum;
+
+ /* data covered by checksum */
+
+ /* version, in case we want to support pg_upgrade */
+ uint32 version;
+ /* how large is the on disk data, excluding the constant sized part */
+ uint32 length;
+
+ /* version dependent part */
+ SnapBuild builder;
+
+ /* variable amount of TransactionIds follows */
+} SnapBuildOnDisk;
+
+extern void ValidateAndRestoreSnapshotFile(SnapBuildOnDisk *ondisk, const char *path,
+ int fd, MemoryContext context);
+
+#endif /* SNAPBUILD_INTERNAL_H */
--
2.34.1