On Fri, Sep 02, 2022 at 06:55:11AM -0500, Justin Pryzby wrote: > On Sun, Jun 13, 2021 at 08:24:12PM -0500, Justin Pryzby wrote: > > In this patch series, I added compression information to the errcontext from > > xlog_block_info(), and allow specifying compression levels like zlib-2. > > I'll > > rearrange that commit earlier if we decide that's desirable to include. > > 4035cd5d4 added wal_compress=lz4 and > e9537321a added wal_compress=zstd > > Since 4035cd5d4, pg_waldump has shown compression info (and walinspect > shows the same since 2258e76f9). > > But if you try to replay WAL on a server which wasn't compiled with > support for the requisite compression methods, it's a bit crummy that it > doesn't include in the error message the reason *why* it failed to > restore the image, if that's caused by the missing compression method.
That's also hitting an elog(). 2022-09-04 15:56:29.916 CDT startup[2625] FATAL: XX000: failed to restore block image 2022-09-04 15:56:29.916 CDT startup[2625] DETAIL: image at 0/1D11CB8 compressed with zstd not supported by build, block 0 2022-09-04 15:56:29.916 CDT startup[2625] CONTEXT: WAL redo at 0/1D11CB8 for Heap/DELETE: off 50 flags 0x00 KEYS_UPDATED ; blkref #0: rel 1663/16384/2610, blk 4 FPW 2022-09-04 15:56:29.916 CDT startup[2625] LOCATION: XLogReadBufferForRedoExtended, xlogutils.c:396 (gdb) bt #0 report_invalid_record (state=0x555555e33ff0, fmt=0x555555b1c1a8 "image at %X/%X compressed with %s not supported by build, block %d") at xlogreader.c:74 #1 0x00005555556beeec in RestoreBlockImage (record=record@entry=0x555555e33ff0, block_id=block_id@entry=0 '\000', page=page@entry=0x7fffee9bdc00 "") at xlogreader.c:2078 #2 0x00005555556c5d39 in XLogReadBufferForRedoExtended (record=record@entry=0x555555e33ff0, block_id=block_id@entry=0 '\000', mode=mode@entry=RBM_NORMAL, get_cleanup_lock=get_cleanup_lock@entry=false, buf=buf@entry=0x7fffffffd760) at xlogutils.c:395 #3 0x00005555556c5e4a in XLogReadBufferForRedo (record=record@entry=0x555555e33ff0, block_id=block_id@entry=0 '\000', buf=buf@entry=0x7fffffffd760) at xlogutils.c:320 #4 0x000055555565bd5b in heap_xlog_delete (record=0x555555e33ff0) at heapam.c:9032 #5 0x00005555556622b7 in heap_redo (record=<optimized out>) at heapam.c:9836 #6 0x00005555556c15ed in ApplyWalRecord (xlogreader=0x555555e33ff0, record=record@entry=0x7fffee2b6820, replayTLI=replayTLI@entry=0x7fffffffd870) at ../../../../src/include/access/xlog_internal.h:379 #7 0x00005555556c4c30 in PerformWalRecovery () at xlogrecovery.c:1725 #8 0x00005555556b7f23 in StartupXLOG () at xlog.c:5291 #9 0x0000555555ac4491 in InitPostgres (in_dbname=in_dbname@entry=0x555555e09390 "postgres", dboid=dboid@entry=0, username=username@entry=0x555555dedda0 "pryzbyj", useroid=useroid@entry=0, load_session_libraries=<optimized out>, override_allow_connections=override_allow_connections@entry=false, out_dbname=0x0) at postinit.c:731 #10 0x000055555598471f in PostgresMain (dbname=0x555555e09390 "postgres", username=username@entry=0x555555dedda0 "pryzbyj") at postgres.c:4085 #11 0x00005555559851b0 in PostgresSingleUserMain (argc=5, argv=0x555555de7530, username=0x555555dedda0 "pryzbyj") at postgres.c:3986 #12 0x0000555555840533 in main (argc=5, argv=0x555555de7530) at main.c:194 I guess it should be promoted to an ereport(), since it's now a user-facing error rathere than an internal one. $ ./tmp_install.without-zstd/usr/local/pgsql/bin/postgres -D ./src/test/regress/tmp_check/data 2022-09-04 15:28:37.446 CDT postmaster[29964] LOG: starting PostgreSQL 16devel on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit 2022-09-04 15:28:37.446 CDT postmaster[29964] LOG: listening on IPv4 address "127.0.0.1", port 5432 2022-09-04 15:28:37.528 CDT postmaster[29964] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432" 2022-09-04 15:28:37.587 CDT startup[29972] LOG: database system was interrupted while in recovery at 2022-09-04 15:27:44 CDT 2022-09-04 15:28:37.587 CDT startup[29972] HINT: This probably means that some data is corrupted and you will have to use the last backup for recovery. 2022-09-04 15:28:38.010 CDT startup[29972] LOG: database system was not properly shut down; automatic recovery in progress 2022-09-04 15:28:38.038 CDT startup[29972] LOG: redo starts at 0/1D118C0 2022-09-04 15:28:38.039 CDT startup[29972] LOG: could not stat file "pg_tblspc/16502": No such file or directory 2022-09-04 15:28:38.039 CDT startup[29972] CONTEXT: WAL redo at 0/1D11970 for Tablespace/DROP: 16502 2022-09-04 15:28:38.039 CDT startup[29972] FATAL: failed to restore block image 2022-09-04 15:28:38.039 CDT startup[29972] DETAIL: image at 0/1D11CB8 compressed with zstd not supported by build, block 0 2022-09-04 15:28:38.039 CDT startup[29972] CONTEXT: WAL redo at 0/1D11CB8 for Heap/DELETE: off 50 flags 0x00 KEYS_UPDATED ; blkref #0: rel 1663/16384/2610, blk 4 FPW diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c index 0cda22597fe..01c7454bcc7 100644 --- a/src/backend/access/transam/xlogutils.c +++ b/src/backend/access/transam/xlogutils.c @@ -393,7 +393,11 @@ XLogReadBufferForRedoExtended(XLogReaderState *record, prefetch_buffer); page = BufferGetPage(*buf); if (!RestoreBlockImage(record, block_id, page)) - elog(ERROR, "failed to restore block image"); + ereport(ERROR, + errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("failed to restore block image"), + errdetail("%s", record->errormsg_buf)); + /* * The page may be uninitialized. If so, we can't set the LSN because