Hi,
I had a closer look at v3 of the patch now.
On Sat, Mar 03, 2018 at 07:23:31PM +0100, Magnus Hagander wrote:
> Attached is a rebased patch which removes this optimization, updates the
> pg_proc entry for the new format, and changes pg_verify_checksums to use -r
> instead of -o for relfilenode.
The patch applies fine with minimal fuzz and compiles with no warnings;
make check and the added isolation tests, as well as the added checksum
tests pass.
If I blindly run "SELECT pg_enable_data_checksums();" on new cluster, I
get:
|postgres=# SELECT pg_enable_data_checksums();
| pg_enable_data_checksums
|--------------------------
| t
|(1 row)
|
|postgres=# SHOW data_checksums ;
| data_checksums
|----------------
| inprogress
|(1 row)
However, inspecting the log one sees:
|2018-03-10 14:15:57.702 CET [3313] ERROR: Database template0 does not allow
connections.
|2018-03-10 14:15:57.702 CET [3313] HINT: Allow connections using ALTER
DATABASE and try again.
|2018-03-10 14:15:57.702 CET [3152] LOG: background worker "checksum helper
launcher" (PID 3313) exited with exit code 1
and the background worker is no longer running without any obvious hint
to the client.
I am aware that this is discussed already, but as-is the user experience
is pretty bad, I think pg_enable_data_checksums() should either bail
earlier if it cannot connect to all databases, or it should be better
documented.
Otherwise, if I allow connections to template0, the patch works as
expected, I have not managed to break it so far.
Some further review comments:
> diff --git a/doc/src/sgml/wal.sgml b/doc/src/sgml/wal.sgml
> index f4bc2d4161..89afecb341 100644
> --- a/doc/src/sgml/wal.sgml
> +++ b/doc/src/sgml/wal.sgml
> @@ -230,6 +230,73 @@
[...]
> + <para>
> + Checksums can be enabled or disabled online, by calling the appropriate
> + <link linkend="functions-admin-checksum">functions</link>.
> + Disabling of checksums takes effect immediately when the function is
> called.
> + </para>
> +
> + <para>
> + Enabling checksums will put the cluster in <literal>inprogress</literal>
> mode.
> + During this time, checksums will be written but not verified. In
> addition to
> + this, a background worker process is started that enables checksums on
> all
> + existing data in the cluster. Once this worker has completed processing
> all
> + databases in the cluster, the checksum mode will automatically switch to
> + <literal>on</literal>.
> + </para>
> +
> + <para>
> + If the cluster is stopped while in <literal>inprogress</literal> mode,
> for
> + any reason, then this process must be restarted manually. To do this,
> + re-execute the function <function>pg_enable_data_checksums()</function>
> + once the cluster has been restarted.
> + </para>
Maybe document the above issue here, unless it is clear that the
templat0-needs-to-allow-connections issue will go away before the patch
is pushed.
> diff --git a/src/backend/access/transam/xlog.c
> b/src/backend/access/transam/xlog.c
> index 47a6c4d895..56aaa88de1 100644
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
[...]
> +void
> +SetDataChecksumsOn(void)
> +{
> + if (!DataChecksumsEnabledOrInProgress())
> + elog(ERROR, "Checksums not enabled or in progress");
> +
> + LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
> +
> + if (ControlFile->data_checksum_version !=
> PG_DATA_CHECKSUM_INPROGRESS_VERSION)
> + {
> + LWLockRelease(ControlFileLock);
> + elog(ERROR, "Checksums not in in_progress mode");
The string used in "SHOW data_checksums" is "inprogress", not
"in_progress".
[...]
> @@ -7769,6 +7839,16 @@ StartupXLOG(void)
> CompleteCommitTsInitialization();
>
> /*
> + * If we reach this point with checksums in inprogress state, we notify
> + * the user that they need to manually restart the process to enable
> + * checksums.
> + */
> + if (ControlFile->data_checksum_version ==
> PG_DATA_CHECKSUM_INPROGRESS_VERSION)
> + ereport(WARNING,
> + (errmsg("checksum state is \"in progress\" with
> no worker"),
> + errhint("Either disable or enable checksums by
> calling the pg_disable_data_checksums() or pg_enable_data_checksums()
> functions.")));
Again, string is "inprogress", not "in progress", not sure if that
matters.
> diff --git a/src/backend/postmaster/checksumhelper.c
> b/src/backend/postmaster/checksumhelper.c
> new file mode 100644
> index 0000000000..6aa71bcf30
> --- /dev/null
> +++ b/src/backend/postmaster/checksumhelper.c
> @@ -0,0 +1,631 @@
> +/*-------------------------------------------------------------------------
> + *
> + * checksumhelper.c
> + * Backend worker to walk the database and write checksums to pages
Backend or Background?
[...]
> +typedef struct ChecksumHelperShmemStruct
> +{
> + pg_atomic_flag launcher_started;
> + bool success;
> + bool process_shared_catalogs;
> + /* Parameter values set on start */
double space.
[...]
> +static bool
> +ProcessSingleRelationFork(Relation reln, ForkNumber forkNum,
> BufferAccessStrategy strategy)
> +{
> + BlockNumber numblocks = RelationGetNumberOfBlocksInFork(reln, forkNum);
> + BlockNumber b;
> +
> + for (b = 0; b < numblocks; b++)
> + {
> + Buffer buf = ReadBufferExtended(reln, forkNum, b,
> RBM_NORMAL, strategy);
> + Page page;
> + PageHeader pagehdr;
> + uint16 checksum;
> +
> + /* Need to get an exclusive lock before we can flag as dirty */
> + LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
> +
> + /* Do we already have a valid checksum? */
> + page = BufferGetPage(buf);
> + pagehdr = (PageHeader) page;
> + checksum = pg_checksum_page((char *) page, b);
This looks like it does not take the segment number into account;
however it is also unclear to me what the purpose of this is, as
checksum is never validated against the pagehdr, and nothing is done
with it. Indeed, I even get a compiler warning about pagehdr and
checksum:
git/postgresql/build/../src/backend/postmaster/checksumhelper.c:
In function ‘ProcessSingleRelationFork’:
git/postgresql/build/../src/backend/postmaster/checksumhelper.c:155:11:
warning: variable ‘checksum’ set but not used
[-Wunused-but-set-variable]
uint16 checksum;
^~~~~~~~
git/postgresql/build/../src/backend/postmaster/checksumhelper.c:154:14:
warning: variable ‘pagehdr’ set but not used [-Wunused-but-set-variable]
PageHeader pagehdr;
^~~~~~~
I guess the above block running pg_checksum_page() is a leftover from
previous versions of the patch and should be removed...
> + /*
> + * Mark the buffer as dirty and force a full page write.
> + * We have to re-write the page to wal even if the checksum
> hasn't
> + * changed, because if there is a replica it might have a
> slightly
> + * different version of the page with an invalid checksum,
> caused
> + * by unlogged changes (e.g. hintbits) on the master happening
> while
> + * checksums were off. This can happen if there was a valid
> checksum
> + * on the page at one point in the past, so only when checksums
> + * are first on, then off, and then turned on again.
> + */
> + START_CRIT_SECTION();
> + MarkBufferDirty(buf);
> + log_newpage_buffer(buf, false);
> + END_CRIT_SECTION();
... seeing how MarkBufferDirty(buf) is now run unconditionally.
[...]
> + /*
> + * Initialize a connection to shared catalogs only.
> + */
> + BackgroundWorkerInitializeConnection(NULL, NULL);
> +
> + /*
> + * Set up so first run processes shared catalogs, but not once in every
> db
> + */
Comment should have a full stop like the above and below ones?
> + ChecksumHelperShmem->process_shared_catalogs = true;
> +
> + /*
> + * Create a database list. We don't need to concern ourselves with
> + * rebuilding this list during runtime since any new created database
> will
> + * be running with checksums turned on from the start.
> + */
[...]
> + DatabaseList = BuildDatabaseList();
> +
> + /*
> + * If there are no databases at all to checksum, we can exit immediately
> + * as there is no work to do.
> + */
> + if (DatabaseList == NIL || list_length(DatabaseList) == 0)
> + return;
> +
> + while (true)
> + {
> + List *remaining = NIL;
> + ListCell *lc,
> + *lc2;
> + List *CurrentDatabases = NIL;
> +
> + foreach(lc, DatabaseList)
> + {
> + ChecksumHelperDatabase *db = (ChecksumHelperDatabase *)
> lfirst(lc);
> +
> + if (ProcessDatabase(db))
> + {
> + pfree(db->dbname);
> + pfree(db);
> +
> + if
> (ChecksumHelperShmem->process_shared_catalogs)
> +
> + /*
> + * Now that one database has completed
> shared catalogs, we
> + * don't have to process them again .
Stray space.
> + */
> +
> ChecksumHelperShmem->process_shared_catalogs = false;
> + }
> + else
> + {
> + /*
> + * Put failed databases on the remaining list.
> + */
> + remaining = lappend(remaining, db);
> + }
> + }
> + list_free(DatabaseList);
> +
> + DatabaseList = remaining;
> + remaining = NIL;
> +
> + /*
> + * DatabaseList now has all databases not yet processed. This
> can be
> + * because they failed for some reason, or because the database
> was
> + * DROPed between us getting the database list and trying to
> process
DROPed looks wrong, and there's no other occurence of it in the source
tree. DROPped looks even weirder, so maybe just "dropped"?
> + * it. Get a fresh list of databases to detect the second case
> with.
That sentence looks unfinished or at least is unclear to me.
> + * Any database that still exists but failed we retry for a
> limited
I'm not a native speaker, but this looks wrong to me as well, maybe "We
retry any database that still exists but failed for a limited [...]"?
> diff --git a/src/bin/pg_upgrade/controldata.c
> b/src/bin/pg_upgrade/controldata.c
> index 0fe98a550e..4bb2b7e6ec 100644
> --- a/src/bin/pg_upgrade/controldata.c
> +++ b/src/bin/pg_upgrade/controldata.c
> @@ -591,6 +591,15 @@ check_control_data(ControlData *oldctrl,
> */
>
> /*
> + * If checksums have been turned on in the old cluster, but the
> + * checksumhelper have yet to finish, then disallow upgrading. The user
> + * should either let the process finish, or turn off checksums, before
> + * retrying.
> + */
> + if (oldctrl->data_checksum_version == 2)
> + pg_fatal("transition to data checksums not completed in old
> cluster\n");
> +
> + /*
> * We might eventually allow upgrades from checksum to no-checksum
> * clusters.
> */
See below about src/bin/pg_upgrade/pg_upgrade.h having
data_checksum_version be a bool.
I checked pg_ugprade (from master to master though), and could find no
off-hand issues, i.e. it reported all issues correctly.
> --- /dev/null
> +++ b/src/bin/pg_verify_checksums/Makefile
[...]
> +check:
> + $(prove_check)
> +
> +installcheck:
> + $(prove_installcheck)
If I run "make check" in src/bin/pg_verify_checksums, I get a fat perl
error:
|src/bin/pg_verify_checksums$ LANG=C make check
|rm -rf '/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build'/tmp_install
|/bin/mkdir -p
'/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build'/tmp_install/log
|make -C '../../..'
DESTDIR='/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build'/tmp_install
install
>'/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build'/tmp_install/log/install.log
2>&1
|rm -rf
'/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/src/bin/pg_verify_checksums'/tmp_check
|/bin/mkdir -p
'/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/src/bin/pg_verify_checksums'/tmp_check
|cd
/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/../src/bin/pg_verify_checksums
&&
TESTDIR='/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/src/bin/pg_verify_checksums'
PATH="/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/tmp_install//bin:$PATH"
LD_LIBRARY_PATH="/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/tmp_install//lib"
PGPORT='65432'
PG_REGRESS='/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/src/bin/pg_verify_checksums/../../../src/test/regress/pg_regress'
/usr/bin/prove -I
/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/../src/test/perl/ -I
/home/mba/Projekte/OSS/PostgreSQL/git/postgresql/build/../src/bin/pg_verify_checksums
t/*.pl
|Cannot detect source of 't/*.pl'! at
/usr/share/perl/5.24/TAP/Parser/IteratorFactory.pm line 261.
|
TAP::Parser::IteratorFactory::detect_source(TAP::Parser::IteratorFactory=HASH(0x55eed10df3e8),
TAP::Parser::Source=HASH(0x55eed10bd358)) called at
/usr/share/perl/5.24/TAP/Parser/IteratorFactory.pm line 211
|
TAP::Parser::IteratorFactory::make_iterator(TAP::Parser::IteratorFactory=HASH(0x55eed10df3e8),
TAP::Parser::Source=HASH(0x55eed10bd358)) called at
/usr/share/perl/5.24/TAP/Parser.pm line 472
| TAP::Parser::_initialize(TAP::Parser=HASH(0x55eed10df328),
HASH(0x55eed0ea2e90)) called at /usr/share/perl/5.24/TAP/Object.pm line 55
| TAP::Object::new("TAP::Parser", HASH(0x55eed0ea2e90)) called at
/usr/share/perl/5.24/TAP/Object.pm line 130
| TAP::Object::_construct(TAP::Harness=HASH(0x55eed09176b0),
"TAP::Parser", HASH(0x55eed0ea2e90)) called at
/usr/share/perl/5.24/TAP/Harness.pm line 852
| TAP::Harness::make_parser(TAP::Harness=HASH(0x55eed09176b0),
TAP::Parser::Scheduler::Job=HASH(0x55eed0fdc708)) called at
/usr/share/perl/5.24/TAP/Harness.pm line 651
| TAP::Harness::_aggregate_single(TAP::Harness=HASH(0x55eed09176b0),
TAP::Parser::Aggregator=HASH(0x55eed091e520),
TAP::Parser::Scheduler=HASH(0x55eed0fdc6a8)) called at
/usr/share/perl/5.24/TAP/Harness.pm line 743
| TAP::Harness::aggregate_tests(TAP::Harness=HASH(0x55eed09176b0),
TAP::Parser::Aggregator=HASH(0x55eed091e520), "t/*.pl") called at
/usr/share/perl/5.24/TAP/Harness.pm line 558
| TAP::Harness::__ANON__() called at /usr/share/perl/5.24/TAP/Harness.pm
line 571
| TAP::Harness::runtests(TAP::Harness=HASH(0x55eed09176b0), "t/*.pl")
called at /usr/share/perl/5.24/App/Prove.pm line 546
| App::Prove::_runtests(App::Prove=HASH(0x55eed090b0c8),
HASH(0x55eed0d79cf0), "t/*.pl") called at /usr/share/perl/5.24/App/Prove.pm
line 504
| App::Prove::run(App::Prove=HASH(0x55eed090b0c8)) called at
/usr/bin/prove line 13
|Makefile:39: recipe for target 'check' failed
|make: *** [check] Error 2
> diff --git a/src/bin/pg_verify_checksums/pg_verify_checksums.c
> b/src/bin/pg_verify_checksums/pg_verify_checksums.c
> new file mode 100644
> index 0000000000..a4bfe7284d
> --- /dev/null
> +++ b/src/bin/pg_verify_checksums/pg_verify_checksums.c
[...]
> + if (DataDir == NULL)
> + {
> + if (optind < argc)
> + DataDir = argv[optind++];
> + else
> + DataDir = getenv("PGDATA");
> + }
> +
> + /* Complain if any arguments remain */
> + if (optind < argc)
> + {
> + fprintf(stderr, _("%s: too many command-line arguments (first
> is \"%s\")\n"),
> + progname, argv[optind]);
> + fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
> + progname);
> + exit(1);
> + }
> +
> + if (DataDir == NULL)
> + {
> + fprintf(stderr, _("%s: no data directory specified\n"),
> progname);
> + fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
> progname);
> + exit(1);
> + }
Those two if (DataDir == NULL) checks could maybe be put together into
one block.
> diff --git a/src/include/postmaster/checksumhelper.h
> b/src/include/postmaster/checksumhelper.h
> new file mode 100644
> index 0000000000..7f296264a9
> --- /dev/null
> +++ b/src/include/postmaster/checksumhelper.h
> @@ -0,0 +1,31 @@
> +/*-------------------------------------------------------------------------
> + *
> + * checksumhelper.h
> + * header file for checksum helper deamon
"deamon" is surely wrong (it'd be "daemon"), but maybe "(background)
worker" is better?
> diff --git a/src/include/storage/bufpage.h b/src/include/storage/bufpage.h
> index 85dd10c45a..bd46bf2ce6 100644
> --- a/src/include/storage/bufpage.h
> +++ b/src/include/storage/bufpage.h
> @@ -194,6 +194,7 @@ typedef PageHeaderData *PageHeader;
> */
> #define PG_PAGE_LAYOUT_VERSION 4
> #define PG_DATA_CHECKSUM_VERSION 1
> +#define PG_DATA_CHECKSUM_INPROGRESS_VERSION 2
I am not very sure about the semantics of PG_DATA_CHECKSUM_VERSION being
1, but I assumed it was a version, like, if we ever decide to use a
different checksumming algorithm, we'd bump it to 2.
Now PG_DATA_CHECKSUM_INPROGRESS_VERSION is defined to 2, which I agree
is convenient, but is there some strategy what to do about this in case
the PG_DATA_CHECKSUM_VERSION needs to be increased?
In any case, src/bin/pg_upgrade/pg_upgrade.h has
bool data_checksum_version;
in the ControlData struct, which might need updating?
That's all for now.
Michael
--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: [email protected]
credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer