Re: Different compression methods for FPI

Justin Pryzby Sun, 04 Sep 2022 17:23:42 -0700

On Fri, Sep 02, 2022 at 06:55:11AM -0500, Justin Pryzby wrote:
> On Sun, Jun 13, 2021 at 08:24:12PM -0500, Justin Pryzby wrote:
> > In this patch series, I added compression information to the errcontext from
> > xlog_block_info(), and allow specifying compression levels like zlib-2.  
> > I'll
> > rearrange that commit earlier if we decide that's desirable to include.
> 
> 4035cd5d4 added wal_compress=lz4 and
> e9537321a added wal_compress=zstd
> 
> Since 4035cd5d4, pg_waldump has shown compression info (and walinspect
> shows the same since 2258e76f9).
> 
> But if you try to replay WAL on a server which wasn't compiled with
> support for the requisite compression methods, it's a bit crummy that it
> doesn't include in the error message the reason *why* it failed to
> restore the image, if that's caused by the missing compression method.


That's also hitting an elog().

2022-09-04 15:56:29.916 CDT startup[2625] FATAL:  XX000: failed to restore 
block image
2022-09-04 15:56:29.916 CDT startup[2625] DETAIL:  image at 0/1D11CB8 
compressed with zstd not supported by build, block 0
2022-09-04 15:56:29.916 CDT startup[2625] CONTEXT:  WAL redo at 0/1D11CB8 for 
Heap/DELETE: off 50 flags 0x00 KEYS_UPDATED ; blkref #0: rel 1663/16384/2610, 
blk 4 FPW
2022-09-04 15:56:29.916 CDT startup[2625] LOCATION:  
XLogReadBufferForRedoExtended, xlogutils.c:396

(gdb) bt
#0  report_invalid_record (state=0x555555e33ff0, fmt=0x555555b1c1a8 "image at 
%X/%X compressed with %s not supported by build, block %d") at xlogreader.c:74
#1  0x00005555556beeec in RestoreBlockImage 
(record=record@entry=0x555555e33ff0, block_id=block_id@entry=0 '\000', 
page=page@entry=0x7fffee9bdc00 "") at xlogreader.c:2078
#2  0x00005555556c5d39 in XLogReadBufferForRedoExtended 
(record=record@entry=0x555555e33ff0, block_id=block_id@entry=0 '\000', 
mode=mode@entry=RBM_NORMAL, get_cleanup_lock=get_cleanup_lock@entry=false, 
    buf=buf@entry=0x7fffffffd760) at xlogutils.c:395
#3  0x00005555556c5e4a in XLogReadBufferForRedo 
(record=record@entry=0x555555e33ff0, block_id=block_id@entry=0 '\000', 
buf=buf@entry=0x7fffffffd760) at xlogutils.c:320
#4  0x000055555565bd5b in heap_xlog_delete (record=0x555555e33ff0) at 
heapam.c:9032
#5  0x00005555556622b7 in heap_redo (record=<optimized out>) at heapam.c:9836
#6  0x00005555556c15ed in ApplyWalRecord (xlogreader=0x555555e33ff0, 
record=record@entry=0x7fffee2b6820, replayTLI=replayTLI@entry=0x7fffffffd870) 
at ../../../../src/include/access/xlog_internal.h:379
#7  0x00005555556c4c30 in PerformWalRecovery () at xlogrecovery.c:1725
#8  0x00005555556b7f23 in StartupXLOG () at xlog.c:5291
#9  0x0000555555ac4491 in InitPostgres 
(in_dbname=in_dbname@entry=0x555555e09390 "postgres", dboid=dboid@entry=0, 
username=username@entry=0x555555dedda0 "pryzbyj", useroid=useroid@entry=0, 
    load_session_libraries=<optimized out>, 
override_allow_connections=override_allow_connections@entry=false, 
out_dbname=0x0) at postinit.c:731
#10 0x000055555598471f in PostgresMain (dbname=0x555555e09390 "postgres", 
username=username@entry=0x555555dedda0 "pryzbyj") at postgres.c:4085
#11 0x00005555559851b0 in PostgresSingleUserMain (argc=5, argv=0x555555de7530, 
username=0x555555dedda0 "pryzbyj") at postgres.c:3986
#12 0x0000555555840533 in main (argc=5, argv=0x555555de7530) at main.c:194

I guess it should be promoted to an ereport(), since it's now a
user-facing error rathere than an internal one.

$ ./tmp_install.without-zstd/usr/local/pgsql/bin/postgres -D 
./src/test/regress/tmp_check/data
2022-09-04 15:28:37.446 CDT postmaster[29964] LOG:  starting PostgreSQL 16devel 
on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 
64-bit
2022-09-04 15:28:37.446 CDT postmaster[29964] LOG:  listening on IPv4 address 
"127.0.0.1", port 5432
2022-09-04 15:28:37.528 CDT postmaster[29964] LOG:  listening on Unix socket 
"/tmp/.s.PGSQL.5432"
2022-09-04 15:28:37.587 CDT startup[29972] LOG:  database system was 
interrupted while in recovery at 2022-09-04 15:27:44 CDT
2022-09-04 15:28:37.587 CDT startup[29972] HINT:  This probably means that some 
data is corrupted and you will have to use the last backup for recovery.
2022-09-04 15:28:38.010 CDT startup[29972] LOG:  database system was not 
properly shut down; automatic recovery in progress
2022-09-04 15:28:38.038 CDT startup[29972] LOG:  redo starts at 0/1D118C0
2022-09-04 15:28:38.039 CDT startup[29972] LOG:  could not stat file 
"pg_tblspc/16502": No such file or directory
2022-09-04 15:28:38.039 CDT startup[29972] CONTEXT:  WAL redo at 0/1D11970 for 
Tablespace/DROP: 16502
2022-09-04 15:28:38.039 CDT startup[29972] FATAL:  failed to restore block image
2022-09-04 15:28:38.039 CDT startup[29972] DETAIL:  image at 0/1D11CB8 
compressed with zstd not supported by build, block 0
2022-09-04 15:28:38.039 CDT startup[29972] CONTEXT:  WAL redo at 0/1D11CB8 for 
Heap/DELETE: off 50 flags 0x00 KEYS_UPDATED ; blkref #0: rel 1663/16384/2610, 
blk 4 FPW

diff --git a/src/backend/access/transam/xlogutils.c 
b/src/backend/access/transam/xlogutils.c
index 0cda22597fe..01c7454bcc7 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -393,7 +393,11 @@ XLogReadBufferForRedoExtended(XLogReaderState *record,
                                                                          
prefetch_buffer);
                page = BufferGetPage(*buf);
                if (!RestoreBlockImage(record, block_id, page))
-                       elog(ERROR, "failed to restore block image");
+                       ereport(ERROR,
+                                       errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+                                       errmsg("failed to restore block image"),
+                                       errdetail("%s", record->errormsg_buf));
+
 
                /*
                 * The page may be uninitialized. If so, we can't set the LSN 
because

Re: Different compression methods for FPI

Reply via email to