On Tue, Jun 01, 2021 at 11:06:53AM +0900, Michael Paquier wrote:
> - Speed and CPU usage.  We should worry about that for CPU-bounded
>   environments.
> - Compression ratio, which is just monitoring the difference in WAL.
> - Effect of the level of compression perhaps?
> - Use a fixed amount of WAL generated, meaning a set of repeatable SQL
>   queries, with one backend, no benchmarks like pgbench.
> - Avoid any I/O bottleneck, so run tests on a tmpfs or ramfs.
> - Avoid any extra WAL interference, like checkpoints, no autovacuum
>   running in parallel.

I think it's more nuanced than just finding the algorithm with the least CPU
use.  The GUC can be set per session (it's PGC_SUSET in the patch), and it's
possible that a data-loading process might want to use zlib for a better
compression ratio, but an interactive OLTP process might want to use lz4 or no
compression for better responsiveness.

Reducing WAL volume during loading can be important.  At one site, the SAN was
too slow to keep up during the period of heaviest loading: the checkpointer
fell behind, WAL couldn't be recycled as normal, the (local) WAL filesystem
overflowed, and the oversized WAL then had to be replayed to the slow SAN.  A
large fraction of their WAL is FPI, and compression has now made this a
non-issue.  We'd happily incur 2x more CPU cost if WAL were 25% smaller.

We're not proposing to enable it by default, so the threshold doesn't have to
be "no performance regression" relative to no compression.  The feature should
provide a faster alternative to PGLZ, and also a method with a better
compression ratio to improve the case of heavy WAL writes, by reducing I/O from
FPI.

In a CPU-bound environment, one would just disable WAL compression, or use LZ4
if it's cheap enough.  In the I/O-bound case, someone might enable zlib or zstd
compression.

I found this old thread about btree performance with wal compression (+Peter,
+Andres).
https://www.postgresql.org/message-id/flat/540584F2-A554-40C1-8F59-87AF8D623BB7%40yandex-team.ru#94c0dcaa34e3170992749f6fdc8db35c

And the differences are pretty dramatic, so I ran a single test on my PC:

CREATE TABLE t AS SELECT generate_series(1,999999)a; VACUUM t;
SET wal_compression= off;
\set QUIET \\ \timing on \\ SET max_parallel_maintenance_workers=0; SELECT pg_stat_reset_shared('wal'); begin; CREATE INDEX ON t(a); rollback; SELECT * FROM pg_stat_wal;
Time: 1639.375 ms (00:01.639)
wal_bytes        | 20357193

pglz writes ~half as much, but takes twice as long as uncompressed:
|Time: 3362.912 ms (00:03.363)
|wal_bytes        | 11644224

zlib writes ~4x less than uncompressed, and is still much faster than pglz:
|Time: 2167.474 ms (00:02.167)
|wal_bytes        | 5611653

lz4 is as fast as uncompressed, and writes a bit more than pglz:
|Time: 1612.874 ms (00:01.613)
|wal_bytes        | 12397123

zstd(6) is slower than lz4, but compresses better than anything but zlib:
|Time: 1808.881 ms (00:01.809)
|wal_bytes        | 6395993

In this patch series, I added compression information to the errcontext from
xlog_block_info(), and allowed specifying compression levels like zlib-2.  I'll
rearrange that commit earlier if we decide that's desirable to include.

From d006044bdd6272fa4e37890b8e634b0b2a179dff Mon Sep 17 00:00:00 2001
From: Andrey Borodin <amboro...@acm.org>
Date: Sat, 27 Feb 2021 09:03:50 +0500
Subject: [PATCH v9 1/9] Allow alternate compression methods for wal_compression

TODO: bump XLOG_PAGE_MAGIC
---
 doc/src/sgml/config.sgml                      |  9 +-
 doc/src/sgml/installation.sgml                |  4 +-
 src/backend/Makefile                          |  2 +-
 src/backend/access/transam/xlog.c             | 14 ++-
 src/backend/access/transam/xloginsert.c       | 65 ++++++++++--
 src/backend/access/transam/xlogreader.c       | 99 ++++++++++++++++---
 src/backend/utils/misc/guc.c                  | 21 ++--
 src/backend/utils/misc/postgresql.conf.sample |  2 +-
 src/bin/pg_waldump/pg_waldump.c               | 13 ++-
 src/include/access/xlog.h                     |  2 +-
 src/include/access/xlog_internal.h            | 10 ++
 src/include/access/xlogrecord.h               | 15 ++-
 12 files changed, 207 insertions(+), 49 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index aa3e178240..1df56d8034 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3127,23 +3127,26 @@ include_dir 'conf.d'
      </varlistentry>
 
      <varlistentry id="guc-wal-compression" xreflabel="wal_compression">
-      <term><varname>wal_compression</varname> (<type>boolean</type>)
+      <term><varname>wal_compression</varname> (<type>enum</type>)
       <indexterm>
        <primary><varname>wal_compression</varname> configuration parameter</primary>
       </indexterm>
       </term>
       <listitem>
        <para>
-        When this parameter is <literal>on</literal>, the <productname>PostgreSQL</productname>
+        This parameter enables compression of WAL using the specified
+        compression method.
+        When enabled, the <productname>PostgreSQL</productname>
         server compresses full page images written to WAL when
         <xref linkend="guc-full-page-writes"/> is on or during a base backup.
         A compressed page image will be decompressed during WAL replay.
+        The supported methods are pglz and zlib.
         The default value is <literal>off</literal>.
         Only superusers can change this setting.
        </para>
 
        <para>
-        Turning this parameter on can reduce the WAL volume without
+        Enabling compression can reduce the WAL volume without
         increasing the risk of unrecoverable data corruption,
         but at the cost of some extra CPU spent on the compression during
         WAL logging and on the decompression during WAL replay.
diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml
index 3c0aa118c7..073d5089f7 100644
--- a/doc/src/sgml/installation.sgml
+++ b/doc/src/sgml/installation.sgml
@@ -147,7 +147,7 @@ su - postgres
       specify the <option>--without-zlib</option> option to
       <filename>configure</filename>. Using this option disables
       support for compressed archives in <application>pg_dump</application> and
-      <application>pg_restore</application>.
+      <application>pg_restore</application>, and compressed WAL.
      </para>
     </listitem>
    </itemizedlist>
@@ -1236,7 +1236,7 @@ build-postgresql:
         Prevents use of the <application>Zlib</application> library.
         This disables
         support for compressed archives in <application>pg_dump</application>
-        and <application>pg_restore</application>.
+        and <application>pg_restore</application> and compressed WAL.
        </para>
       </listitem>
      </varlistentry>
diff --git a/src/backend/Makefile b/src/backend/Makefile
index 0da848b1fd..3af216ddfc 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -48,7 +48,7 @@ OBJS = \
 LIBS := $(filter-out -lpgport -lpgcommon, $(LIBS)) $(LDAP_LIBS_BE) $(ICU_LIBS)
 
 # The backend doesn't need everything that's in LIBS, however
-LIBS := $(filter-out -lz -lreadline -ledit -ltermcap -lncurses -lcurses, $(LIBS))
+LIBS := $(filter-out -lreadline -ledit -ltermcap -lncurses -lcurses, $(LIBS))
 
 ifeq ($(with_systemd),yes)
 LIBS += -lsystemd
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 17eeff0720..1ccc51575a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -98,7 +98,7 @@ char	   *XLogArchiveCommand = NULL;
 bool		EnableHotStandby = false;
 bool		fullPageWrites = true;
 bool		wal_log_hints = false;
-bool		wal_compression = false;
+int			wal_compression = WAL_COMPRESSION_NONE;
 char	   *wal_consistency_checking_string = NULL;
 bool	   *wal_consistency_checking = NULL;
 bool		wal_init_zero = true;
@@ -10470,7 +10470,17 @@ xlog_block_info(StringInfo buf, XLogReaderState *record)
 							 rnode.spcNode, rnode.dbNode, rnode.relNode,
 							 blk);
 		if (XLogRecHasBlockImage(record, block_id))
-			appendStringInfoString(buf, " FPW");
+		{
+			int compression =
+				BKPIMAGE_IS_COMPRESSED(record->blocks[block_id].bimg_info) ?
+				BKPIMAGE_COMPRESSION(record->blocks[block_id].bimg_info) : -1;
+			if (compression == -1)
+				appendStringInfoString(buf, " FPW");
+			else
+				appendStringInfo(buf, " FPW, compression method %d/%s",
+					compression, wal_compression_name(compression));
+		}
+
 	}
 }
 
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 32b4cc84e7..4f81f19c49 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -33,8 +33,18 @@
 #include "storage/proc.h"
 #include "utils/memutils.h"
 
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+/* zlib compressBound is not a macro */
+#define ZLIB_MAX_BLCKSZ BLCKSZ + (BLCKSZ>>12) + (BLCKSZ>>14) + (BLCKSZ>>25) + 13
+#else
+#define ZLIB_MAX_BLCKSZ 0
+#endif
+
 /* Buffer size required to store a compressed version of backup block image */
-#define PGLZ_MAX_BLCKSZ PGLZ_MAX_OUTPUT(BLCKSZ)
+#define PGLZ_MAX_BLCKSZ	PGLZ_MAX_OUTPUT(BLCKSZ)
+
+#define COMPRESS_BUFSIZE	Max(PGLZ_MAX_BLCKSZ, ZLIB_MAX_BLCKSZ)
 
 /*
  * For each block reference registered with XLogRegisterBuffer, we fill in
@@ -58,7 +68,7 @@ typedef struct
 								 * backup block data in XLogRecordAssemble() */
 
 	/* buffer to store a compressed version of backup block image */
-	char		compressed_page[PGLZ_MAX_BLCKSZ];
+	char		compressed_page[COMPRESS_BUFSIZE];
 } registered_buffer;
 
 static registered_buffer *registered_buffers;
@@ -113,7 +123,8 @@ static XLogRecData *XLogRecordAssemble(RmgrId rmid, uint8 info,
 									   XLogRecPtr RedoRecPtr, bool doPageWrites,
 									   XLogRecPtr *fpw_lsn, int *num_fpi);
 static bool XLogCompressBackupBlock(char *page, uint16 hole_offset,
-									uint16 hole_length, char *dest, uint16 *dlen);
+									uint16 hole_length, char *dest,
+									uint16 *dlen, WalCompression compression);
 
 /*
  * Begin constructing a WAL record. This must be called before the
@@ -628,13 +639,14 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
 			/*
 			 * Try to compress a block image if wal_compression is enabled
 			 */
-			if (wal_compression)
+			if (wal_compression != WAL_COMPRESSION_NONE)
 			{
 				is_compressed =
 					XLogCompressBackupBlock(page, bimg.hole_offset,
 											cbimg.hole_length,
 											regbuf->compressed_page,
-											&compressed_len);
+											&compressed_len,
+											wal_compression);
 			}
 
 			/*
@@ -665,8 +677,13 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
 
 			if (is_compressed)
 			{
+				/* The current compression is stored in the WAL record */
+				wal_compression_name(wal_compression); /* Range check */
+				Assert(wal_compression < (1 << BKPIMAGE_COMPRESS_BITS));
+
 				bimg.length = compressed_len;
-				bimg.bimg_info |= BKPIMAGE_IS_COMPRESSED;
+				bimg.bimg_info |=
+					wal_compression << BKPIMAGE_COMPRESS_OFFSET_BITS;
 
 				rdt_datas_last->data = regbuf->compressed_page;
 				rdt_datas_last->len = compressed_len;
@@ -827,7 +844,7 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
  */
 static bool
 XLogCompressBackupBlock(char *page, uint16 hole_offset, uint16 hole_length,
-						char *dest, uint16 *dlen)
+						char *dest, uint16 *dlen, WalCompression compression)
 {
 	int32		orig_len = BLCKSZ - hole_length;
 	int32		len;
@@ -853,12 +870,42 @@ XLogCompressBackupBlock(char *page, uint16 hole_offset, uint16 hole_length,
 	else
 		source = page;
 
+	switch (compression)
+	{
+	case WAL_COMPRESSION_PGLZ:
+		len = pglz_compress(source, orig_len, dest, PGLZ_strategy_default);
+		break;
+
+#ifdef HAVE_LIBZ
+	case WAL_COMPRESSION_ZLIB:
+		{
+			unsigned long len_l = COMPRESS_BUFSIZE;
+			int ret;
+			ret = compress2((Bytef*)dest, &len_l, (Bytef*)source, orig_len, 1);
+			if (ret != Z_OK)
+				len_l = -1;
+			len = len_l;
+			break;
+		}
+#endif
+
+	default:
+		/*
+		 * It should be impossible to get here for unsupported algorithms,
+		 * which cannot be assigned if they're not enabled at compile time.
+		 */
+		ereport(ERROR,
+				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+				 errmsg("unknown compression method requested: %d/%s",
+						compression, wal_compression_name(compression))));
+
+	}
+
 	/*
-	 * We recheck the actual size even if pglz_compress() reports success and
+	 * We recheck the actual size even if compression reports success and
 	 * see if the number of bytes saved by compression is larger than the
 	 * length of extra data needed for the compressed version of block image.
 	 */
-	len = pglz_compress(source, orig_len, dest, PGLZ_strategy_default);
 	if (len >= 0 &&
 		len + extra_bytes < orig_len)
 	{
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 42738eb940..3a922caaeb 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -26,6 +26,7 @@
 #include "catalog/pg_control.h"
 #include "common/pg_lzcompress.h"
 #include "replication/origin.h"
+#include "utils/guc.h"
 
 #ifndef FRONTEND
 #include "miscadmin.h"
@@ -33,6 +34,10 @@
 #include "utils/memutils.h"
 #endif
 
+#ifdef HAVE_LIBZ
+#include <zlib.h>
+#endif
+
 static void report_invalid_record(XLogReaderState *state, const char *fmt,...)
 			pg_attribute_printf(2, 3);
 static bool allocate_recordbuf(XLogReaderState *state, uint32 reclength);
@@ -50,6 +55,29 @@ static void WALOpenSegmentInit(WALOpenSegment *seg, WALSegmentContext *segcxt,
 /* size of the buffer allocated for error message. */
 #define MAX_ERRORMSG_LEN 1000
 
+/*
+ * Accept the likely variants for none and pglz, for compatibility with old
+ * server versions where wal_compression was a boolean.
+ */
+const struct config_enum_entry wal_compression_options[] = {
+	{"off", WAL_COMPRESSION_NONE, false},
+	{"none", WAL_COMPRESSION_NONE, false},
+	{"false", WAL_COMPRESSION_NONE, true},
+	{"no", WAL_COMPRESSION_NONE, true},
+	{"0", WAL_COMPRESSION_NONE, true},
+	{"pglz", WAL_COMPRESSION_PGLZ, false},
+	{"true", WAL_COMPRESSION_PGLZ, true},
+	{"yes", WAL_COMPRESSION_PGLZ, true},
+	{"on", WAL_COMPRESSION_PGLZ, true},
+	{"1", WAL_COMPRESSION_PGLZ, true},
+
+#ifdef HAVE_LIBZ
+	{"zlib", WAL_COMPRESSION_ZLIB, false},
+#endif
+
+	{NULL, 0, false}
+};
+
 /*
  * Construct a string in state->errormsg_buf explaining what's wrong with
  * the current record being read.
@@ -1290,7 +1318,7 @@ DecodeXLogRecord(XLogReaderState *state, XLogRecord *record, char **errormsg)
 
 				blk->apply_image = ((blk->bimg_info & BKPIMAGE_APPLY) != 0);
 
-				if (blk->bimg_info & BKPIMAGE_IS_COMPRESSED)
+				if (BKPIMAGE_IS_COMPRESSED(blk->bimg_info))
 				{
 					if (blk->bimg_info & BKPIMAGE_HAS_HOLE)
 						COPY_HEADER_FIELD(&blk->hole_length, sizeof(uint16));
@@ -1335,29 +1363,28 @@ DecodeXLogRecord(XLogReaderState *state, XLogRecord *record, char **errormsg)
 				}
 
 				/*
-				 * cross-check that bimg_len < BLCKSZ if the IS_COMPRESSED
-				 * flag is set.
+				 * cross-check that bimg_len < BLCKSZ if it's compressed
 				 */
-				if ((blk->bimg_info & BKPIMAGE_IS_COMPRESSED) &&
+				if (BKPIMAGE_IS_COMPRESSED(blk->bimg_info) &&
 					blk->bimg_len == BLCKSZ)
 				{
 					report_invalid_record(state,
-										  "BKPIMAGE_IS_COMPRESSED set, but block image length %u at %X/%X",
+										  "BKPIMAGE_IS_COMPRESSED, but block image length %u at %X/%X",
 										  (unsigned int) blk->bimg_len,
 										  LSN_FORMAT_ARGS(state->ReadRecPtr));
 					goto err;
 				}
 
 				/*
-				 * cross-check that bimg_len = BLCKSZ if neither HAS_HOLE nor
-				 * IS_COMPRESSED flag is set.
+				 * cross-check that bimg_len = BLCKSZ if neither HAS_HOLE is
+				 * set nor IS_COMPRESSED().
 				 */
 				if (!(blk->bimg_info & BKPIMAGE_HAS_HOLE) &&
-					!(blk->bimg_info & BKPIMAGE_IS_COMPRESSED) &&
+					!BKPIMAGE_IS_COMPRESSED(blk->bimg_info) &&
 					blk->bimg_len != BLCKSZ)
 				{
 					report_invalid_record(state,
-										  "neither BKPIMAGE_HAS_HOLE nor BKPIMAGE_IS_COMPRESSED set, but block image length is %u at %X/%X",
+										  "neither BKPIMAGE_HAS_HOLE nor BKPIMAGE_IS_COMPRESSED, but block image length is %u at %X/%X",
 										  (unsigned int) blk->data_len,
 										  LSN_FORMAT_ARGS(state->ReadRecPtr));
 					goto err;
@@ -1535,6 +1562,22 @@ XLogRecGetBlockData(XLogReaderState *record, uint8 block_id, Size *len)
 	}
 }
 
+/*
+ * Return a statically allocated string associated with the given compression
+ * method.
+ */
+const char *
+wal_compression_name(WalCompression compression)
+{
+	for (int i=0; wal_compression_options[i].name != NULL; ++i)
+	{
+		if (wal_compression_options[i].val == compression)
+			return wal_compression_options[i].name;
+	}
+
+	return "???";
+}
+
 /*
  * Restore a full-page image from a backup block attached to an XLOG record.
  *
@@ -1555,11 +1598,43 @@ RestoreBlockImage(XLogReaderState *record, uint8 block_id, char *page)
 	bkpb = &record->blocks[block_id];
 	ptr = bkpb->bkp_image;
 
-	if (bkpb->bimg_info & BKPIMAGE_IS_COMPRESSED)
+	if (BKPIMAGE_IS_COMPRESSED(bkpb->bimg_info))
 	{
+		int compression_method = BKPIMAGE_COMPRESSION(bkpb->bimg_info);
 		/* If a backup block image is compressed, decompress it */
-		if (pglz_decompress(ptr, bkpb->bimg_len, tmp.data,
-							BLCKSZ - bkpb->hole_length, true) < 0)
+		int32 decomp_result = -1;
+		switch (compression_method)
+		{
+		case WAL_COMPRESSION_PGLZ:
+			decomp_result = pglz_decompress(ptr, bkpb->bimg_len, tmp.data,
+							BLCKSZ - bkpb->hole_length, true);
+			break;
+
+#ifdef HAVE_LIBZ
+		case WAL_COMPRESSION_ZLIB:
+			{
+				unsigned long decomp_result_l;
+				decomp_result_l = BLCKSZ - bkpb->hole_length;
+				if (uncompress((Bytef*)tmp.data, &decomp_result_l,
+							(Bytef*)ptr, bkpb->bimg_len) == Z_OK)
+					decomp_result = decomp_result_l;
+				else
+					decomp_result = -1;
+				break;
+			}
+#endif
+
+		default:
+			report_invalid_record(record, "image at %X/%X is compressed with unsupported codec, block %d (%d/%s)",
+								  (uint32) (record->ReadRecPtr >> 32),
+								  (uint32) record->ReadRecPtr,
+								  block_id,
+								  compression_method,
+								  wal_compression_name(compression_method));
+			return false;
+		}
+
+		if (decomp_result < 0)
 		{
 			report_invalid_record(record, "invalid compressed image at %X/%X, block %d",
 								  LSN_FORMAT_ARGS(record->ReadRecPtr),
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 68b62d523d..ce1149bed5 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -548,6 +548,7 @@ extern const struct config_enum_entry archive_mode_options[];
 extern const struct config_enum_entry recovery_target_action_options[];
 extern const struct config_enum_entry sync_method_options[];
 extern const struct config_enum_entry dynamic_shared_memory_options[];
+extern const struct config_enum_entry wal_compression_options[];
 
 /*
  * GUC option variables that are exported from this module
@@ -1304,16 +1305,6 @@ static struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
-	{
-		{"wal_compression", PGC_SUSET, WAL_SETTINGS,
-			gettext_noop("Compresses full-page writes written in WAL file."),
-			NULL
-		},
-		&wal_compression,
-		false,
-		NULL, NULL, NULL
-	},
-
 	{
 		{"wal_init_zero", PGC_SUSET, WAL_SETTINGS,
 			gettext_noop("Writes zeroes to new WAL files before first use."),
@@ -4825,6 +4816,16 @@ static struct config_enum ConfigureNamesEnum[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"wal_compression", PGC_SUSET, WAL_SETTINGS,
+			gettext_noop("Set the method used to compress full page images in the WAL."),
+			NULL
+		},
+		&wal_compression,
+		WAL_COMPRESSION_NONE, wal_compression_options,
+		NULL, NULL, NULL
+	},
+
 	{
 		{"dynamic_shared_memory_type", PGC_POSTMASTER, RESOURCES_MEM,
 			gettext_noop("Selects the dynamic shared memory implementation used."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index ddbb6dc2be..61786c6e07 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -218,7 +218,7 @@
 #full_page_writes = on			# recover from partial page writes
 #wal_log_hints = off			# also do full page writes of non-critical updates
 					# (change requires restart)
-#wal_compression = off			# enable compression of full-page writes
+#wal_compression = off			# enable compression of full-page writes: off, pglz, zlib
 #wal_init_zero = on			# zero-fill new WAL files
 #wal_recycle = on			# recycle WAL files
 #wal_buffers = -1			# min 32kB, -1 sets based on shared_buffers
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index f8b8afe4a7..1cd71ac2f7 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -537,18 +537,21 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogReaderState *record)
 					   blk);
 			if (XLogRecHasBlockImage(record, block_id))
 			{
-				if (record->blocks[block_id].bimg_info &
-					BKPIMAGE_IS_COMPRESSED)
+				if (BKPIMAGE_IS_COMPRESSED(record->blocks[block_id].bimg_info))
 				{
+					int compression = BKPIMAGE_COMPRESSION(
+						record->blocks[block_id].bimg_info);
+
 					printf(" (FPW%s); hole: offset: %u, length: %u, "
-						   "compression saved: %u",
+						   "compression method %d/%s, saved: %u",
 						   XLogRecBlockImageApply(record, block_id) ?
 						   "" : " for WAL verification",
 						   record->blocks[block_id].hole_offset,
 						   record->blocks[block_id].hole_length,
+						   compression, wal_compression_name(compression),
 						   BLCKSZ -
-						   record->blocks[block_id].hole_length -
-						   record->blocks[block_id].bimg_len);
+							   record->blocks[block_id].hole_length -
+							   record->blocks[block_id].bimg_len);
 				}
 				else
 				{
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index 77187c12be..e8b2c53784 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -116,7 +116,7 @@ extern char *XLogArchiveCommand;
 extern bool EnableHotStandby;
 extern bool fullPageWrites;
 extern bool wal_log_hints;
-extern bool wal_compression;
+extern int	wal_compression;
 extern bool wal_init_zero;
 extern bool wal_recycle;
 extern bool *wal_consistency_checking;
diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h
index 26a743b6b6..8b740af66d 100644
--- a/src/include/access/xlog_internal.h
+++ b/src/include/access/xlog_internal.h
@@ -324,4 +324,14 @@ extern bool InArchiveRecovery;
 extern bool StandbyMode;
 extern char *recoveryRestoreCommand;
 
+/* These are the compression IDs written into bimg_info */
+typedef enum WalCompression
+{
+	WAL_COMPRESSION_NONE,
+	WAL_COMPRESSION_PGLZ,
+	WAL_COMPRESSION_ZLIB,
+} WalCompression;
+
+extern const char *wal_compression_name(WalCompression compression);
+
 #endif							/* XLOG_INTERNAL_H */
diff --git a/src/include/access/xlogrecord.h b/src/include/access/xlogrecord.h
index 80c92a2498..2a60c0fb92 100644
--- a/src/include/access/xlogrecord.h
+++ b/src/include/access/xlogrecord.h
@@ -114,7 +114,7 @@ typedef struct XLogRecordBlockHeader
  * present is (BLCKSZ - <length of "hole" bytes>).
  *
  * Additionally, when wal_compression is enabled, we will try to compress full
- * page images using the PGLZ compression algorithm, after removing the "hole".
+ * page images, after removing the "hole".
  * This can reduce the WAL volume, but at some extra cost of CPU spent
  * on the compression during WAL logging. In this case, since the "hole"
  * length cannot be calculated by subtracting the number of page image bytes
@@ -144,9 +144,18 @@ typedef struct XLogRecordBlockImageHeader
 
 /* Information stored in bimg_info */
 #define BKPIMAGE_HAS_HOLE		0x01	/* page image has "hole" */
-#define BKPIMAGE_IS_COMPRESSED		0x02	/* page image is compressed */
-#define BKPIMAGE_APPLY		0x04	/* page image should be restored during
+#define BKPIMAGE_APPLY			0x02	/* page image should be restored during
 									 * replay */
+#define BKPIMAGE_COMPRESS_METHOD1	0x04	/* bits to encode compression method */
+#define BKPIMAGE_COMPRESS_METHOD2	0x08	/* 0=none, 1=pglz, 2=zlib */
+
+/* How many bits to shift to extract compression */
+#define	BKPIMAGE_COMPRESS_OFFSET_BITS	2
+/* How many bits are for compression */
+#define	BKPIMAGE_COMPRESS_BITS			2
+/* Extract the compression from the bimg_info */
+#define	BKPIMAGE_COMPRESSION(info)		((info >> BKPIMAGE_COMPRESS_OFFSET_BITS) & ((1<<BKPIMAGE_COMPRESS_BITS) - 1))
+#define	BKPIMAGE_IS_COMPRESSED(info)	(BKPIMAGE_COMPRESSION(info) != 0)
 
 /*
  * Extra header information used when page image has "hole" and
-- 
2.17.0

From dc75a78fd4b8f4b036b9d06b1b8d48aab84ccd5c Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyota....@gmail.com>
Date: Mon, 8 Mar 2021 15:32:30 +0900
Subject: [PATCH v9 2/9] Run 011_crash_recovery.pl with wal_level=minimal

The test doesn't need that feature and pg_current_xact_id() is better
exercised by turning off the feature.

Copied from: https://www.postgresql.org/message-id/20210308.173242.463790587797836129.horikyota.ntt%40gmail.com
---
 src/test/recovery/t/011_crash_recovery.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/test/recovery/t/011_crash_recovery.pl b/src/test/recovery/t/011_crash_recovery.pl
index a26e99500b..2e7e3db639 100644
--- a/src/test/recovery/t/011_crash_recovery.pl
+++ b/src/test/recovery/t/011_crash_recovery.pl
@@ -14,7 +14,7 @@ use Config;
 plan tests => 3;
 
 my $node = get_new_node('primary');
-$node->init(allows_streaming => 1);
+$node->init();
 $node->start;
 
 my ($stdin, $stdout, $stderr) = ('', '', '');
-- 
2.17.0

From db62efed575e3f1a30a84ae8e525915afe3c7a94 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyota....@gmail.com>
Date: Mon, 8 Mar 2021 15:43:01 +0900
Subject: [PATCH v9 3/9] Make sure published XIDs are persistent

pg_xact_status() assumes that XIDs obtained by
pg_current_xact_id(_if_assigned)() are persistent beyond a crash. But
XIDs are not guaranteed to go beyond WAL buffers before commit and
thus XIDs may vanish if the server crashes before commit. This patch
guarantees the XID shown by the functions to be flushed out to disk.

Copied from: https://www.postgresql.org/message-id/20210308.173242.463790587797836129.horikyota.ntt%40gmail.com
---
 src/backend/access/transam/xact.c | 55 +++++++++++++++++++++++++------
 src/backend/access/transam/xlog.c |  2 +-
 src/backend/utils/adt/xid8funcs.c | 12 ++++++-
 src/include/access/xact.h         |  3 +-
 4 files changed, 59 insertions(+), 13 deletions(-)

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 441445927e..da8a460722 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -201,7 +201,7 @@ typedef struct TransactionStateData
 	int			prevSecContext; /* previous SecurityRestrictionContext */
 	bool		prevXactReadOnly;	/* entry-time xact r/o state */
 	bool		startedInRecovery;	/* did we start in recovery? */
-	bool		didLogXid;		/* has xid been included in WAL record? */
+	XLogRecPtr	minLSN;			/* LSN needed to reach to record the xid */
 	int			parallelModeLevel;	/* Enter/ExitParallelMode counter */
 	bool		chain;			/* start a new block after this one */
 	bool		assigned;		/* assigned to top-level XID */
@@ -520,14 +520,46 @@ GetCurrentFullTransactionIdIfAny(void)
  * MarkCurrentTransactionIdLoggedIfAny
  *
  * Remember that the current xid - if it is assigned - now has been wal logged.
+ *
+ * upto is the LSN up to which we need to flush WAL to ensure the current xid
+ * to be persistent. See EnsureTopTransactionIdLogged().
  */
 void
-MarkCurrentTransactionIdLoggedIfAny(void)
+MarkCurrentTransactionIdLoggedIfAny(XLogRecPtr upto)
 {
-	if (FullTransactionIdIsValid(CurrentTransactionState->fullTransactionId))
-		CurrentTransactionState->didLogXid = true;
+	if (FullTransactionIdIsValid(CurrentTransactionState->fullTransactionId) &&
+		XLogRecPtrIsInvalid(CurrentTransactionState->minLSN))
+		CurrentTransactionState->minLSN = upto;
 }
 
+/*
+ * EnsureTopTransactionIdLogged
+ *
+ * Make sure that the current top XID is WAL-logged.
+ */
+void
+EnsureTopTransactionIdLogged(void)
+{
+	/*
+	 * We need at least one WAL record for the current top transaction to be
+	 * flushed out. Write one if we don't have one yet.
+	 */
+	if (XLogRecPtrIsInvalid(TopTransactionStateData.minLSN))
+	{
+		xl_xact_assignment xlrec;
+
+		xlrec.xtop = XidFromFullTransactionId(XactTopFullTransactionId);
+		Assert(TransactionIdIsValid(xlrec.xtop));
+		xlrec.nsubxacts = 0;
+
+		XLogBeginInsert();
+		XLogRegisterData((char *) &xlrec, MinSizeOfXactAssignment);
+		TopTransactionStateData.minLSN =
+			XLogInsert(RM_XACT_ID, XLOG_XACT_ASSIGNMENT);
+	}
+
+	XLogFlush(TopTransactionStateData.minLSN);
+}
 
 /*
  * GetStableLatestTransactionId
@@ -616,14 +648,14 @@ AssignTransactionId(TransactionState s)
 	 * When wal_level=logical, guarantee that a subtransaction's xid can only
 	 * be seen in the WAL stream if its toplevel xid has been logged before.
 	 * If necessary we log an xact_assignment record with fewer than
-	 * PGPROC_MAX_CACHED_SUBXIDS. Note that it is fine if didLogXid isn't set
+	 * PGPROC_MAX_CACHED_SUBXIDS. Note that it is fine if minLSN isn't set
 	 * for a transaction even though it appears in a WAL record, we just might
 	 * superfluously log something. That can happen when an xid is included
 	 * somewhere inside a wal record, but not in XLogRecord->xl_xid, like in
 	 * xl_standby_locks.
 	 */
 	if (isSubXact && XLogLogicalInfoActive() &&
-		!TopTransactionStateData.didLogXid)
+		XLogRecPtrIsInvalid(TopTransactionStateData.minLSN))
 		log_unknown_top = true;
 
 	/*
@@ -693,6 +725,7 @@ AssignTransactionId(TransactionState s)
 			log_unknown_top)
 		{
 			xl_xact_assignment xlrec;
+			XLogRecPtr endptr;
 
 			/*
 			 * xtop is always set by now because we recurse up transaction
@@ -707,11 +740,13 @@ AssignTransactionId(TransactionState s)
 			XLogRegisterData((char *) unreportedXids,
 							 nUnreportedXids * sizeof(TransactionId));
 
-			(void) XLogInsert(RM_XACT_ID, XLOG_XACT_ASSIGNMENT);
+			endptr = XLogInsert(RM_XACT_ID, XLOG_XACT_ASSIGNMENT);
 
 			nUnreportedXids = 0;
-			/* mark top, not current xact as having been logged */
-			TopTransactionStateData.didLogXid = true;
+
+			/* set minLSN of top, not of current xact if not yet */
+			if (XLogRecPtrIsInvalid(TopTransactionStateData.minLSN))
+				TopTransactionStateData.minLSN = endptr;
 		}
 	}
 }
@@ -1996,7 +2031,7 @@ StartTransaction(void)
 	 * initialize reported xid accounting
 	 */
 	nUnreportedXids = 0;
-	s->didLogXid = false;
+	s->minLSN = InvalidXLogRecPtr;
 
 	/*
 	 * must initialize resource-management stuff first
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1ccc51575a..f7909aa1d7 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1162,7 +1162,7 @@ XLogInsertRecord(XLogRecData *rdata,
 	 */
 	WALInsertLockRelease();
 
-	MarkCurrentTransactionIdLoggedIfAny();
+	MarkCurrentTransactionIdLoggedIfAny(EndPos);
 
 	END_CRIT_SECTION();
 
diff --git a/src/backend/utils/adt/xid8funcs.c b/src/backend/utils/adt/xid8funcs.c
index cc2b4ac797..992482f8c8 100644
--- a/src/backend/utils/adt/xid8funcs.c
+++ b/src/backend/utils/adt/xid8funcs.c
@@ -357,6 +357,8 @@ bad_format:
 Datum
 pg_current_xact_id(PG_FUNCTION_ARGS)
 {
+	FullTransactionId xid;
+
 	/*
 	 * Must prevent during recovery because if an xid is not assigned we try
 	 * to assign one, which would fail. Programs already rely on this function
@@ -365,7 +367,12 @@ pg_current_xact_id(PG_FUNCTION_ARGS)
 	 */
 	PreventCommandDuringRecovery("pg_current_xact_id()");
 
-	PG_RETURN_FULLTRANSACTIONID(GetTopFullTransactionId());
+	xid = GetTopFullTransactionId();
+
+	/* the XID is going to be published, make sure it is persistent */
+	EnsureTopTransactionIdLogged();
+
+	PG_RETURN_FULLTRANSACTIONID(xid);
 }
 
 /*
@@ -380,6 +387,9 @@ pg_current_xact_id_if_assigned(PG_FUNCTION_ARGS)
 	if (!FullTransactionIdIsValid(topfxid))
 		PG_RETURN_NULL();
 
+	/* the XID is going to be published, make sure it is persistent */
+	EnsureTopTransactionIdLogged();
+
 	PG_RETURN_FULLTRANSACTIONID(topfxid);
 }
 
diff --git a/src/include/access/xact.h b/src/include/access/xact.h
index 134f6862da..593a4140df 100644
--- a/src/include/access/xact.h
+++ b/src/include/access/xact.h
@@ -386,7 +386,8 @@ extern FullTransactionId GetTopFullTransactionId(void);
 extern FullTransactionId GetTopFullTransactionIdIfAny(void);
 extern FullTransactionId GetCurrentFullTransactionId(void);
 extern FullTransactionId GetCurrentFullTransactionIdIfAny(void);
-extern void MarkCurrentTransactionIdLoggedIfAny(void);
+extern void MarkCurrentTransactionIdLoggedIfAny(XLogRecPtr upto);
+extern void EnsureTopTransactionIdLogged(void);
 extern bool SubTransactionIsActive(SubTransactionId subxid);
 extern CommandId GetCurrentCommandId(bool used);
 extern void SetParallelStartTimestamps(TimestampTz xact_ts, TimestampTz stmt_ts);
-- 
2.17.0

From 2a899bbbbca7d18ce79258001de910a1715bed3a Mon Sep 17 00:00:00 2001
From: Justin Pryzby <pryz...@telsasoft.com>
Date: Thu, 11 Mar 2021 17:36:24 -0600
Subject: [PATCH v9 4/9] wal_compression_method: default to zlib..

this is meant to exercise the CIs, and not meant to be merged
---
 src/backend/access/transam/xlog.c | 2 +-
 src/backend/utils/misc/guc.c      | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index f7909aa1d7..3254c42243 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -98,7 +98,7 @@ char	   *XLogArchiveCommand = NULL;
 bool		EnableHotStandby = false;
 bool		fullPageWrites = true;
 bool		wal_log_hints = false;
-int			wal_compression = WAL_COMPRESSION_NONE;
+int			wal_compression = WAL_COMPRESSION_ZLIB;
 char	   *wal_consistency_checking_string = NULL;
 bool	   *wal_consistency_checking = NULL;
 bool		wal_init_zero = true;
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index ce1149bed5..14a2203225 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -4822,7 +4822,7 @@ static struct config_enum ConfigureNamesEnum[] =
 			NULL
 		},
 		&wal_compression,
-		WAL_COMPRESSION_NONE, wal_compression_options,
+		WAL_COMPRESSION_ZLIB, wal_compression_options,
 		NULL, NULL, NULL
 	},
 
-- 
2.17.0

>From d953ea3a2cf103bf0650c80ad5855bed87204f81 Mon Sep 17 00:00:00 2001 From: Justin Pryzby <pryz...@telsasoft.com> Date: Mon, 24 May 2021 23:32:30 -0500 Subject: [PATCH v9 5/9] (re)add wal_compression_method: lz4 --- doc/src/sgml/config.sgml | 3 ++- doc/src/sgml/install-windows.sgml | 2 +- doc/src/sgml/installation.sgml | 5 +++-- src/backend/access/transam/xloginsert.c | 17 ++++++++++++++++- src/backend/access/transam/xlogreader.c | 15 +++++++++++++++ src/backend/utils/misc/postgresql.conf.sample | 2 +- src/include/access/xlog_internal.h | 1 + src/include/access/xlogrecord.h | 2 +- 8 files changed, 40 insertions(+), 7 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 1df56d8034..df5ff70d91 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -3140,7 +3140,8 @@ include_dir 'conf.d' server compresses full page images written to WAL when <xref linkend="guc-full-page-writes"/> is on or during a base backup. A compressed page image will be decompressed during WAL replay. - The supported methods are pglz and zlib. + The supported methods are pglz, zlib, and (if configured when + <productname>PostgreSQL</productname> was built) lz4. The default value is <literal>off</literal>. Only superusers can change this setting. </para> diff --git a/doc/src/sgml/install-windows.sgml b/doc/src/sgml/install-windows.sgml index 312edc6f7a..ba794b8c93 100644 --- a/doc/src/sgml/install-windows.sgml +++ b/doc/src/sgml/install-windows.sgml @@ -299,7 +299,7 @@ $ENV{MSBFLAGS}="/m"; <term><productname>LZ4</productname></term> <listitem><para> Required for supporting <productname>LZ4</productname> compression - method for compressing the table data. Binaries and source can be + method for compressing table or WAL data. Binaries and source can be downloaded from <ulink url="https://github.com/lz4/lz4/releases"></ulink>. 
</para></listitem> diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml index 073d5089f7..c7673a4dc8 100644 --- a/doc/src/sgml/installation.sgml +++ b/doc/src/sgml/installation.sgml @@ -270,7 +270,8 @@ su - postgres <para> You need <productname>LZ4</productname>, if you want to support compression of data with this method; see - <xref linkend="guc-default-toast-compression"/>. + <xref linkend="guc-default-toast-compression"/> and + <xref linkend="guc-wal-compression"/>. </para> </listitem> @@ -980,7 +981,7 @@ build-postgresql: <para> Build with <productname>LZ4</productname> compression support. This allows the use of <productname>LZ4</productname> for - compression of table data. + compression of table and WAL data. </para> </listitem> </varlistentry> diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c index 4f81f19c49..a8794a941a 100644 --- a/src/backend/access/transam/xloginsert.c +++ b/src/backend/access/transam/xloginsert.c @@ -41,10 +41,17 @@ #define ZLIB_MAX_BLCKSZ 0 #endif +#ifdef USE_LZ4 +#include "lz4.h" +#define LZ4_MAX_BLCKSZ LZ4_COMPRESSBOUND(BLCKSZ) +#else +#define LZ4_MAX_BLCKSZ 0 +#endif + /* Buffer size required to store a compressed version of backup block image */ #define PGLZ_MAX_BLCKSZ PGLZ_MAX_OUTPUT(BLCKSZ) -#define COMPRESS_BUFSIZE Max(PGLZ_MAX_BLCKSZ, ZLIB_MAX_BLCKSZ) +#define COMPRESS_BUFSIZE Max(Max(PGLZ_MAX_BLCKSZ, ZLIB_MAX_BLCKSZ), LZ4_MAX_BLCKSZ) /* * For each block reference registered with XLogRegisterBuffer, we fill in @@ -889,6 +896,14 @@ XLogCompressBackupBlock(char *page, uint16 hole_offset, uint16 hole_length, } #endif +#ifdef USE_LZ4 + case WAL_COMPRESSION_LZ4: + len = LZ4_compress_fast(source, dest, orig_len, COMPRESS_BUFSIZE, 1); + if (len == 0) + len = -1; + break; +#endif + default: /* * It should be impossible to get here for unsupported algorithms, diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c index 
3a922caaeb..e44817fece 100644 --- a/src/backend/access/transam/xlogreader.c +++ b/src/backend/access/transam/xlogreader.c @@ -38,6 +38,10 @@ #include <zlib.h> #endif +#ifdef USE_LZ4 +#include "lz4.h" +#endif + static void report_invalid_record(XLogReaderState *state, const char *fmt,...) pg_attribute_printf(2, 3); static bool allocate_recordbuf(XLogReaderState *state, uint32 reclength); @@ -75,6 +79,10 @@ const struct config_enum_entry wal_compression_options[] = { {"zlib", WAL_COMPRESSION_ZLIB, false}, #endif +#ifdef USE_LZ4 + {"lz4", WAL_COMPRESSION_LZ4, false}, +#endif + {NULL, 0, false} }; @@ -1624,6 +1632,13 @@ RestoreBlockImage(XLogReaderState *record, uint8 block_id, char *page) } #endif +#ifdef USE_LZ4 + case WAL_COMPRESSION_LZ4: + decomp_result = LZ4_decompress_safe(ptr, tmp.data, + bkpb->bimg_len, BLCKSZ-bkpb->hole_length); + break; +#endif + default: report_invalid_record(record, "image at %X/%X is compressed with unsupported codec, block %d (%d/%s)", (uint32) (record->ReadRecPtr >> 32), diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index 61786c6e07..728acef953 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -218,7 +218,7 @@ #full_page_writes = on # recover from partial page writes #wal_log_hints = off # also do full page writes of non-critical updates # (change requires restart) -#wal_compression = off # enable compression of full-page writes: off, pglz, zlib +#wal_compression = off # enable compression of full-page writes: off, pglz, zlib, lz4 #wal_init_zero = on # zero-fill new WAL files #wal_recycle = on # recycle WAL files #wal_buffers = -1 # min 32kB, -1 sets based on shared_buffers diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h index 8b740af66d..0287592cd4 100644 --- a/src/include/access/xlog_internal.h +++ b/src/include/access/xlog_internal.h @@ -330,6 +330,7 @@ typedef enum 
WalCompression WAL_COMPRESSION_NONE, WAL_COMPRESSION_PGLZ, WAL_COMPRESSION_ZLIB, + WAL_COMPRESSION_LZ4, } WalCompression; extern const char *wal_compression_name(WalCompression compression); diff --git a/src/include/access/xlogrecord.h b/src/include/access/xlogrecord.h index 2a60c0fb92..abb42b364d 100644 --- a/src/include/access/xlogrecord.h +++ b/src/include/access/xlogrecord.h @@ -147,7 +147,7 @@ typedef struct XLogRecordBlockImageHeader #define BKPIMAGE_APPLY 0x02 /* page image should be restored during * replay */ #define BKPIMAGE_COMPRESS_METHOD1 0x04 /* bits to encode compression method */ -#define BKPIMAGE_COMPRESS_METHOD2 0x08 /* 0=none, 1=pglz, 2=zlib */ +#define BKPIMAGE_COMPRESS_METHOD2 0x08 /* 0=none, 1=pglz, 2=zlib, 3=lz4 */ /* How many bits to shift to extract compression */ #define BKPIMAGE_COMPRESS_OFFSET_BITS 2 -- 2.17.0
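A note on the buffer sizing in the patch above: COMPRESS_BUFSIZE is the largest worst-case output size over the methods that were compiled in, and a method that was not built contributes 0. Below is a minimal sketch in plain C, assuming BLCKSZ is 8192 and restating the bound formulas as I understand them from pg_lzcompress.h and lz4.h. The SK_* names are hypothetical stand-ins for illustration, not the real macros.

```c
#include <stddef.h>

/* Assumption: the default PostgreSQL block size. */
#define SK_BLCKSZ 8192

/* Stand-in for PGLZ_MAX_OUTPUT(): input length plus 4 bytes (assumed). */
#define SK_PGLZ_BOUND(n) ((n) + 4)

/* Stand-in for LZ4_COMPRESSBOUND(): n + n/255 + 16 (assumed). */
#define SK_LZ4_BOUND(n) ((n) + (n) / 255 + 16)

/* A method that is not compiled in contributes 0, as in the patch. */
#define SK_ZLIB_BOUND 0

#define SK_MAX(a, b) ((a) > (b) ? (a) : (b))

/* Largest worst-case compressed size over all available methods. */
static size_t
sk_compress_bufsize(void)
{
    return SK_MAX(SK_MAX(SK_PGLZ_BOUND(SK_BLCKSZ), SK_ZLIB_BOUND),
                  SK_LZ4_BOUND(SK_BLCKSZ));
}
```

With these assumed formulas the LZ4 bound is the largest term, so a single buffer of that size is enough for every method in the sketch.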
>From 20924767e23ba489c27599e91eb80fc8935d603d Mon Sep 17 00:00:00 2001 From: Justin Pryzby <pryz...@telsasoft.com> Date: Fri, 12 Mar 2021 15:35:40 -0600 Subject: [PATCH v9 6/9] Default to LZ4.. this is meant to exercise in the CIs, and not meant to be merged --- configure | 6 ++++-- configure.ac | 4 ++-- src/backend/access/transam/xlog.c | 2 +- src/backend/utils/misc/guc.c | 2 +- 4 files changed, 8 insertions(+), 6 deletions(-) diff --git a/configure b/configure index e9b98f442f..7038b0727c 100755 --- a/configure +++ b/configure @@ -1575,7 +1575,7 @@ Optional Packages: --with-system-tzdata=DIR use system time zone data in DIR --without-zlib do not use Zlib - --with-lz4 build with LZ4 support + --without-lz4 build without LZ4 support --with-gnu-ld assume the C compiler uses GNU ld [default=no] --with-ssl=LIB use LIB for SSL/TLS support (openssl) --with-openssl obsolete spelling of --with-ssl=openssl @@ -8598,7 +8598,9 @@ $as_echo "#define USE_LZ4 1" >>confdefs.h esac else - with_lz4=no + with_lz4=yes + +$as_echo "#define USE_LZ4 1" >>confdefs.h fi diff --git a/configure.ac b/configure.ac index 3b42d8bdc9..cb0261f179 100644 --- a/configure.ac +++ b/configure.ac @@ -990,8 +990,8 @@ AC_SUBST(with_zlib) # LZ4 # AC_MSG_CHECKING([whether to build with LZ4 support]) -PGAC_ARG_BOOL(with, lz4, no, [build with LZ4 support], - [AC_DEFINE([USE_LZ4], 1, [Define to 1 to build with LZ4 support. (--with-lz4)])]) +PGAC_ARG_BOOL(with, lz4, yes, [build without LZ4 support], + [AC_DEFINE([USE_LZ4], 1, [Define to 1 to build without LZ4 support. 
(--without-lz4)])]) AC_MSG_RESULT([$with_lz4]) AC_SUBST(with_lz4) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 3254c42243..f2b0af6360 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -98,7 +98,7 @@ char *XLogArchiveCommand = NULL; bool EnableHotStandby = false; bool fullPageWrites = true; bool wal_log_hints = false; -int wal_compression = WAL_COMPRESSION_ZLIB; +int wal_compression = WAL_COMPRESSION_LZ4; char *wal_consistency_checking_string = NULL; bool *wal_consistency_checking = NULL; bool wal_init_zero = true; diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 14a2203225..0ad62e4d1f 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -4822,7 +4822,7 @@ static struct config_enum ConfigureNamesEnum[] = NULL }, &wal_compression, - WAL_COMPRESSION_ZLIB, wal_compression_options, + WAL_COMPRESSION_LZ4, wal_compression_options, NULL, NULL, NULL }, -- 2.17.0
>From e25090646f6a0ca619bb9faaa7ff6a330c277b0f Mon Sep 17 00:00:00 2001 From: Justin Pryzby <pryz...@telsasoft.com> Date: Fri, 12 Mar 2021 14:43:53 -0600 Subject: [PATCH v9 7/9] add wal_compression_method: zstd TODO: 9ca40dcd4d0cad43d95a9a253fafaa9a9ba7de24 --- configure | 217 ++++++++++++++++++ configure.ac | 33 +++ doc/src/sgml/config.sgml | 2 +- doc/src/sgml/installation.sgml | 19 ++ src/backend/access/transam/xloginsert.c | 18 +- src/backend/access/transam/xlogreader.c | 18 ++ src/backend/utils/misc/postgresql.conf.sample | 2 +- src/include/access/xlog_internal.h | 1 + src/include/access/xlogrecord.h | 5 +- src/include/pg_config.h.in | 3 + src/tools/msvc/Solution.pm | 1 + src/tools/msvc/config_default.pl | 1 + 12 files changed, 315 insertions(+), 5 deletions(-) diff --git a/configure b/configure index 7038b0727c..72bbd719dc 100755 --- a/configure +++ b/configure @@ -699,6 +699,9 @@ with_gnu_ld LD LDFLAGS_SL LDFLAGS_EX +ZSTD_LIBS +ZSTD_CFLAGS +with_zstd LZ4_LIBS LZ4_CFLAGS with_lz4 @@ -868,6 +871,7 @@ with_libxslt with_system_tzdata with_zlib with_lz4 +with_zstd with_gnu_ld with_ssl with_openssl @@ -897,6 +901,8 @@ XML2_CFLAGS XML2_LIBS LZ4_CFLAGS LZ4_LIBS +ZSTD_CFLAGS +ZSTD_LIBS LDFLAGS_EX LDFLAGS_SL PERL @@ -1576,6 +1582,7 @@ Optional Packages: use system time zone data in DIR --without-zlib do not use Zlib --without-lz4 build without LZ4 support + --with-zstd build with Zstd compression library --with-gnu-ld assume the C compiler uses GNU ld [default=no] --with-ssl=LIB use LIB for SSL/TLS support (openssl) --with-openssl obsolete spelling of --with-ssl=openssl @@ -1605,6 +1612,8 @@ Some influential environment variables: XML2_LIBS linker flags for XML2, overriding pkg-config LZ4_CFLAGS C compiler flags for LZ4, overriding pkg-config LZ4_LIBS linker flags for LZ4, overriding pkg-config + ZSTD_CFLAGS C compiler flags for ZSTD, overriding pkg-config + ZSTD_LIBS linker flags for ZSTD, overriding pkg-config LDFLAGS_EX extra linker flags for linking executables 
only LDFLAGS_SL extra linker flags for linking shared libraries only PERL Perl program @@ -8715,6 +8724,147 @@ fi done fi +# +# ZSTD +# +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with zstd support" >&5 +$as_echo_n "checking whether to build with zstd support... " >&6; } + + + +# Check whether --with-zstd was given. +if test "${with_zstd+set}" = set; then : + withval=$with_zstd; + case $withval in + yes) + +$as_echo "#define USE_ZSTD 1" >>confdefs.h + + ;; + no) + : + ;; + *) + as_fn_error $? "no argument expected for --with-zstd option" "$LINENO" 5 + ;; + esac + +else + with_zstd=no + +fi + + +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_zstd" >&5 +$as_echo "$with_zstd" >&6; } + + +if test "$with_zstd" = yes; then + +pkg_failed=no +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for libzstd" >&5 +$as_echo_n "checking for libzstd... " >&6; } + +if test -n "$ZSTD_CFLAGS"; then + pkg_cv_ZSTD_CFLAGS="$ZSTD_CFLAGS" + elif test -n "$PKG_CONFIG"; then + if test -n "$PKG_CONFIG" && \ + { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libzstd\""; } >&5 + ($PKG_CONFIG --exists --print-errors "libzstd") 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + pkg_cv_ZSTD_CFLAGS=`$PKG_CONFIG --cflags "libzstd" 2>/dev/null` + test "x$?" != "x0" && pkg_failed=yes +else + pkg_failed=yes +fi + else + pkg_failed=untried +fi +if test -n "$ZSTD_LIBS"; then + pkg_cv_ZSTD_LIBS="$ZSTD_LIBS" + elif test -n "$PKG_CONFIG"; then + if test -n "$PKG_CONFIG" && \ + { { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"libzstd\""; } >&5 + ($PKG_CONFIG --exists --print-errors "libzstd") 2>&5 + ac_status=$? + $as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 + test $ac_status = 0; }; then + pkg_cv_ZSTD_LIBS=`$PKG_CONFIG --libs "libzstd" 2>/dev/null` + test "x$?" 
!= "x0" && pkg_failed=yes +else + pkg_failed=yes +fi + else + pkg_failed=untried +fi + + + +if test $pkg_failed = yes; then + { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 +$as_echo "no" >&6; } + +if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then + _pkg_short_errors_supported=yes +else + _pkg_short_errors_supported=no +fi + if test $_pkg_short_errors_supported = yes; then + ZSTD_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "libzstd" 2>&1` + else + ZSTD_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "libzstd" 2>&1` + fi + # Put the nasty error message in config.log where it belongs + echo "$ZSTD_PKG_ERRORS" >&5 + + as_fn_error $? "Package requirements (libzstd) were not met: + +$ZSTD_PKG_ERRORS + +Consider adjusting the PKG_CONFIG_PATH environment variable if you +installed software in a non-standard prefix. + +Alternatively, you may set the environment variables ZSTD_CFLAGS +and ZSTD_LIBS to avoid the need to call pkg-config. +See the pkg-config man page for more details." "$LINENO" 5 +elif test $pkg_failed = untried; then + { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5 +$as_echo "no" >&6; } + { { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5 +$as_echo "$as_me: error: in \`$ac_pwd':" >&2;} +as_fn_error $? "The pkg-config script could not be found or is too old. Make sure it +is in your PATH or set the PKG_CONFIG environment variable to the full +path to pkg-config. + +Alternatively, you may set the environment variables ZSTD_CFLAGS +and ZSTD_LIBS to avoid the need to call pkg-config. +See the pkg-config man page for more details. + +To get pkg-config, see <http://pkg-config.freedesktop.org/>. 
+See \`config.log' for more details" "$LINENO" 5; } +else + ZSTD_CFLAGS=$pkg_cv_ZSTD_CFLAGS + ZSTD_LIBS=$pkg_cv_ZSTD_LIBS + { $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5 +$as_echo "yes" >&6; } + +fi + # We only care about -I, -D, and -L switches; + # note that -lzstd will be added by AC_CHECK_LIB below. + for pgac_option in $ZSTD_CFLAGS; do + case $pgac_option in + -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";; + esac + done + for pgac_option in $ZSTD_LIBS; do + case $pgac_option in + -L*) LDFLAGS="$LDFLAGS $pgac_option";; + esac + done +fi + # # Assignments # @@ -12878,6 +13028,56 @@ fi fi +if test "$with_zstd" = yes ; then + { $as_echo "$as_me:${as_lineno-$LINENO}: checking for ZSTD_compress in -lzstd" >&5 +$as_echo_n "checking for ZSTD_compress in -lzstd... " >&6; } +if ${ac_cv_lib_zstd_ZSTD_compress+:} false; then : + $as_echo_n "(cached) " >&6 +else + ac_check_lib_save_LIBS=$LIBS +LIBS="-lzstd $LIBS" +cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ + +/* Override any GCC internal prototype to avoid an error. + Use char because int might match the return type of a GCC + builtin and then its argument prototype would still apply. */ +#ifdef __cplusplus +extern "C" +#endif +char ZSTD_compress (); +int +main () +{ +return ZSTD_compress (); + ; + return 0; +} +_ACEOF +if ac_fn_c_try_link "$LINENO"; then : + ac_cv_lib_zstd_ZSTD_compress=yes +else + ac_cv_lib_zstd_ZSTD_compress=no +fi +rm -f core conftest.err conftest.$ac_objext \ + conftest$ac_exeext conftest.$ac_ext +LIBS=$ac_check_lib_save_LIBS +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_zstd_ZSTD_compress" >&5 +$as_echo "$ac_cv_lib_zstd_ZSTD_compress" >&6; } +if test "x$ac_cv_lib_zstd_ZSTD_compress" = xyes; then : + cat >>confdefs.h <<_ACEOF +#define HAVE_LIBZSTD 1 +_ACEOF + + LIBS="-lzstd $LIBS" + +else + as_fn_error $? 
"library 'zstd' is required for ZSTD support" "$LINENO" 5 +fi + +fi + # Note: We can test for libldap_r only after we know PTHREAD_LIBS if test "$with_ldap" = yes ; then _LIBS="$LIBS" @@ -13600,6 +13800,23 @@ done fi +if test "$with_zstd" = yes; then + for ac_header in zstd.h +do : + ac_fn_c_check_header_mongrel "$LINENO" "zstd.h" "ac_cv_header_zstd_h" "$ac_includes_default" +if test "x$ac_cv_header_zstd_h" = xyes; then : + cat >>confdefs.h <<_ACEOF +#define HAVE_ZSTD_H 1 +_ACEOF + +else + as_fn_error $? "zstd.h header file is required for zstd" "$LINENO" 5 +fi + +done + +fi + if test "$with_gssapi" = yes ; then for ac_header in gssapi/gssapi.h do : diff --git a/configure.ac b/configure.ac index cb0261f179..c348a3ee91 100644 --- a/configure.ac +++ b/configure.ac @@ -1011,6 +1011,31 @@ if test "$with_lz4" = yes; then done fi +# +# ZSTD +# +AC_MSG_CHECKING([whether to build with zstd support]) +PGAC_ARG_BOOL(with, zstd, no, [build with Zstd compression library], + [AC_DEFINE([USE_ZSTD], 1, [Define to 1 to build with zstd support. (--with-zstd)])]) +AC_MSG_RESULT([$with_zstd]) +AC_SUBST(with_zstd) + +if test "$with_zstd" = yes; then + PKG_CHECK_MODULES(ZSTD, libzstd) + # We only care about -I, -D, and -L switches; + # note that -lzstd will be added by AC_CHECK_LIB below. 
+ for pgac_option in $ZSTD_CFLAGS; do + case $pgac_option in + -I*|-D*) CPPFLAGS="$CPPFLAGS $pgac_option";; + esac + done + for pgac_option in $ZSTD_LIBS; do + case $pgac_option in + -L*) LDFLAGS="$LDFLAGS $pgac_option";; + esac + done +fi + # # Assignments # @@ -1285,6 +1310,10 @@ if test "$with_lz4" = yes ; then AC_CHECK_LIB(lz4, LZ4_compress_default, [], [AC_MSG_ERROR([library 'lz4' is required for LZ4 support])]) fi +if test "$with_zstd" = yes ; then + AC_CHECK_LIB(zstd, ZSTD_compress, [], [AC_MSG_ERROR([library 'zstd' is required for ZSTD support])]) +fi + # Note: We can test for libldap_r only after we know PTHREAD_LIBS if test "$with_ldap" = yes ; then _LIBS="$LIBS" @@ -1443,6 +1472,10 @@ if test "$with_lz4" = yes; then AC_CHECK_HEADERS(lz4.h, [], [AC_MSG_ERROR([lz4.h header file is required for LZ4])]) fi +if test "$with_zstd" = yes; then + AC_CHECK_HEADERS(zstd.h, [], [AC_MSG_ERROR([zstd.h header file is required for zstd])]) +fi + if test "$with_gssapi" = yes ; then AC_CHECK_HEADERS(gssapi/gssapi.h, [], [AC_CHECK_HEADERS(gssapi.h, [], [AC_MSG_ERROR([gssapi.h header file is required for GSSAPI])])]) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index df5ff70d91..ee4c44fb7f 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -3141,7 +3141,7 @@ include_dir 'conf.d' <xref linkend="guc-full-page-writes"/> is on or during a base backup. A compressed page image will be decompressed during WAL replay. The supported methods are pglz, zlib, and (if configured when - <productname>PostgreSQL</productname> was built) lz4. + <productname>PostgreSQL</productname> was built) lz4 and zstd. The default value is <literal>off</literal>. Only superusers can change this setting. 
</para> diff --git a/doc/src/sgml/installation.sgml b/doc/src/sgml/installation.sgml index c7673a4dc8..3e985bbd05 100644 --- a/doc/src/sgml/installation.sgml +++ b/doc/src/sgml/installation.sgml @@ -275,6 +275,14 @@ su - postgres </para> </listitem> + <listitem> + <para> + The <productname>ZSTD</productname> library can be used to enable + compression using that method; see + <xref linkend="guc-wal-compression"/>. + </para> + </listitem> + <listitem> <para> To build the <productname>PostgreSQL</productname> documentation, @@ -986,6 +994,17 @@ build-postgresql: </listitem> </varlistentry> + <varlistentry> + <term><option>--with-zstd</option></term> + <listitem> + <para> + Build with <productname>ZSTD</productname> compression support. + This enables use of <productname>ZSTD</productname> for + compression of WAL data. + </para> + </listitem> + </varlistentry> + <varlistentry> <term><option>--with-ssl=<replaceable>LIBRARY</replaceable></option> <indexterm> diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c index a8794a941a..96f497d5d6 100644 --- a/src/backend/access/transam/xloginsert.c +++ b/src/backend/access/transam/xloginsert.c @@ -48,10 +48,17 @@ #define LZ4_MAX_BLCKSZ 0 #endif +#ifdef USE_ZSTD +#include "zstd.h" +#define ZSTD_MAX_BLCKSZ ZSTD_COMPRESSBOUND(BLCKSZ) +#else +#define ZSTD_MAX_BLCKSZ 0 +#endif + /* Buffer size required to store a compressed version of backup block image */ #define PGLZ_MAX_BLCKSZ PGLZ_MAX_OUTPUT(BLCKSZ) -#define COMPRESS_BUFSIZE Max(Max(PGLZ_MAX_BLCKSZ, ZLIB_MAX_BLCKSZ), LZ4_MAX_BLCKSZ) +#define COMPRESS_BUFSIZE Max(Max(Max(PGLZ_MAX_BLCKSZ, ZLIB_MAX_BLCKSZ), LZ4_MAX_BLCKSZ), ZSTD_MAX_BLCKSZ) /* * For each block reference registered with XLogRegisterBuffer, we fill in @@ -904,6 +911,15 @@ XLogCompressBackupBlock(char *page, uint16 hole_offset, uint16 hole_length, break; #endif +#ifdef USE_ZSTD + case WAL_COMPRESSION_ZSTD: + len = ZSTD_compress(dest, COMPRESS_BUFSIZE, source, orig_len, + 
ZSTD_CLEVEL_DEFAULT); + if (ZSTD_isError(len)) + len = -1; + break; +#endif + default: /* * It should be impossible to get here for unsupported algorithms, diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c index e44817fece..1b13d1f660 100644 --- a/src/backend/access/transam/xlogreader.c +++ b/src/backend/access/transam/xlogreader.c @@ -42,6 +42,10 @@ #include "lz4.h" #endif +#ifdef USE_ZSTD +#include "zstd.h" +#endif + static void report_invalid_record(XLogReaderState *state, const char *fmt,...) pg_attribute_printf(2, 3); static bool allocate_recordbuf(XLogReaderState *state, uint32 reclength); @@ -83,6 +87,10 @@ const struct config_enum_entry wal_compression_options[] = { {"lz4", WAL_COMPRESSION_LZ4, false}, #endif +#ifdef USE_ZSTD + {"zstd", WAL_COMPRESSION_ZSTD, false}, +#endif + {NULL, 0, false} }; @@ -1639,6 +1647,16 @@ RestoreBlockImage(XLogReaderState *record, uint8 block_id, char *page) break; #endif +#ifdef USE_ZSTD + case WAL_COMPRESSION_ZSTD: + decomp_result = ZSTD_decompress(tmp.data, BLCKSZ-bkpb->hole_length, + ptr, bkpb->bimg_len); + // XXX: ZSTD_getErrorName + if (ZSTD_isError(decomp_result)) + decomp_result = -1; + break; +#endif + default: report_invalid_record(record, "image at %X/%X is compressed with unsupported codec, block %d (%d/%s)", (uint32) (record->ReadRecPtr >> 32), diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index 728acef953..818b26faad 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -218,7 +218,7 @@ #full_page_writes = on # recover from partial page writes #wal_log_hints = off # also do full page writes of non-critical updates # (change requires restart) -#wal_compression = off # enable compression of full-page writes: off, pglz, zlib, lz4 +#wal_compression = off # enable compression of full-page writes: off, pglz, zlib, lz4, zstd #wal_init_zero = on # zero-fill 
new WAL files #wal_recycle = on # recycle WAL files #wal_buffers = -1 # min 32kB, -1 sets based on shared_buffers diff --git a/src/include/access/xlog_internal.h b/src/include/access/xlog_internal.h index 0287592cd4..1da965a708 100644 --- a/src/include/access/xlog_internal.h +++ b/src/include/access/xlog_internal.h @@ -331,6 +331,7 @@ typedef enum WalCompression WAL_COMPRESSION_PGLZ, WAL_COMPRESSION_ZLIB, WAL_COMPRESSION_LZ4, + WAL_COMPRESSION_ZSTD, } WalCompression; extern const char *wal_compression_name(WalCompression compression); diff --git a/src/include/access/xlogrecord.h b/src/include/access/xlogrecord.h index abb42b364d..84ffb17596 100644 --- a/src/include/access/xlogrecord.h +++ b/src/include/access/xlogrecord.h @@ -147,12 +147,13 @@ typedef struct XLogRecordBlockImageHeader #define BKPIMAGE_APPLY 0x02 /* page image should be restored during * replay */ #define BKPIMAGE_COMPRESS_METHOD1 0x04 /* bits to encode compression method */ -#define BKPIMAGE_COMPRESS_METHOD2 0x08 /* 0=none, 1=pglz, 2=zlib, 3=lz4 */ +#define BKPIMAGE_COMPRESS_METHOD2 0x08 /* 0=none, 1=pglz, 2=zlib, 3=lz4, 4=zstd */ +#define BKPIMAGE_COMPRESS_METHOD3 0x10 /* How many bits to shift to extract compression */ #define BKPIMAGE_COMPRESS_OFFSET_BITS 2 /* How many bits are for compression */ -#define BKPIMAGE_COMPRESS_BITS 2 +#define BKPIMAGE_COMPRESS_BITS 3 /* Extract the compression from the bimg_info */ #define BKPIMAGE_COMPRESSION(info) ((info >> BKPIMAGE_COMPRESS_OFFSET_BITS) & ((1<<BKPIMAGE_COMPRESS_BITS) - 1)) #define BKPIMAGE_IS_COMPRESSED(info) (BKPIMAGE_COMPRESSION(info) != 0) diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in index 783b8fc1ba..bb44ef2a9d 100644 --- a/src/include/pg_config.h.in +++ b/src/include/pg_config.h.in @@ -917,6 +917,9 @@ /* Define to 1 to build with LZ4 support. (--with-lz4) */ #undef USE_LZ4 +/* Define to 1 if you have the `zstd' library (-lzstd). */ +#undef USE_ZSTD + /* Define to select named POSIX semaphores. 
*/ #undef USE_NAMED_POSIX_SEMAPHORES diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm index a7b8f720b5..28ff10f09f 100644 --- a/src/tools/msvc/Solution.pm +++ b/src/tools/msvc/Solution.pm @@ -494,6 +494,7 @@ sub GenerateFiles USE_LIBXML => undef, USE_LIBXSLT => undef, USE_LZ4 => undef, + USE_ZSTD => $self->{options}->{zstd} ? 1 : undef, USE_LDAP => $self->{options}->{ldap} ? 1 : undef, USE_LLVM => undef, USE_NAMED_POSIX_SEMAPHORES => undef, diff --git a/src/tools/msvc/config_default.pl b/src/tools/msvc/config_default.pl index 460c0375d4..b8a1aac3c2 100644 --- a/src/tools/msvc/config_default.pl +++ b/src/tools/msvc/config_default.pl @@ -26,6 +26,7 @@ our $config = { xslt => undef, # --with-libxslt=<path> iconv => undef, # (not in configure, path to iconv) zlib => undef # --with-zlib=<path> + zstd => undef # --with-zstd=<path> }; 1; -- 2.17.0
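To illustrate why the xlogrecord.h hunk above widens BKPIMAGE_COMPRESS_BITS from 2 to 3: with a 2-bit field only the values 0..3 fit, so method 4 (zstd) would read back as 0, i.e. "not compressed". Below is a minimal sketch in plain C, assuming the method ID is stored shifted left by BKPIMAGE_COMPRESS_OFFSET_BITS inside bimg_info. The sk_* helpers are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

#define SK_BKPIMAGE_APPLY        0x02   /* same value as BKPIMAGE_APPLY */
#define SK_COMPRESS_OFFSET_BITS  2      /* same value as in the patch */

/* Pack a compression method ID into the info byte (assumed layout). */
static uint8_t
sk_encode_method(uint8_t flags, int method)
{
    return (uint8_t) (flags | (method << SK_COMPRESS_OFFSET_BITS));
}

/* Same shape as BKPIMAGE_COMPRESSION(), with the field width as a parameter. */
static int
sk_extract_method(uint8_t info, int nbits)
{
    return (info >> SK_COMPRESS_OFFSET_BITS) & ((1 << nbits) - 1);
}
```

Encoding method 4 sets bit 0x10, which is the bit the patch names BKPIMAGE_COMPRESS_METHOD3. Extracting with a 3-bit mask returns 4, while a 2-bit mask drops that bit and returns 0.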
>From 3929203166e8fadcd33dbd46ffdf7522e1bf0151 Mon Sep 17 00:00:00 2001 From: Justin Pryzby <pryz...@telsasoft.com> Date: Fri, 12 Mar 2021 15:35:53 -0600 Subject: [PATCH v9 8/9] Default to zstd.. for CI, not for merge --- configure | 6 ++++-- configure.ac | 2 +- src/backend/access/transam/xlog.c | 2 +- src/backend/utils/misc/guc.c | 2 +- 4 files changed, 7 insertions(+), 5 deletions(-) diff --git a/configure b/configure index 72bbd719dc..b445db933e 100755 --- a/configure +++ b/configure @@ -1582,7 +1582,7 @@ Optional Packages: use system time zone data in DIR --without-zlib do not use Zlib --without-lz4 build without LZ4 support - --with-zstd build with Zstd compression library + --without-zstd build without Zstd compression library --with-gnu-ld assume the C compiler uses GNU ld [default=no] --with-ssl=LIB use LIB for SSL/TLS support (openssl) --with-openssl obsolete spelling of --with-ssl=openssl @@ -8750,7 +8750,9 @@ $as_echo "#define USE_ZSTD 1" >>confdefs.h esac else - with_zstd=no + with_zstd=yes + +$as_echo "#define USE_ZSTD 1" >>confdefs.h fi diff --git a/configure.ac b/configure.ac index c348a3ee91..f8ee35ebfd 100644 --- a/configure.ac +++ b/configure.ac @@ -1015,7 +1015,7 @@ fi # ZSTD # AC_MSG_CHECKING([whether to build with zstd support]) -PGAC_ARG_BOOL(with, zstd, no, [build with Zstd compression library], +PGAC_ARG_BOOL(with, zstd, yes, [build without Zstd compression library], [AC_DEFINE([USE_ZSTD], 1, [Define to 1 to build with zstd support. 
(--with-zstd)])]) AC_MSG_RESULT([$with_zstd]) AC_SUBST(with_zstd) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index f2b0af6360..599381337e 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -98,7 +98,7 @@ char *XLogArchiveCommand = NULL; bool EnableHotStandby = false; bool fullPageWrites = true; bool wal_log_hints = false; -int wal_compression = WAL_COMPRESSION_LZ4; +int wal_compression = WAL_COMPRESSION_ZSTD; char *wal_consistency_checking_string = NULL; bool *wal_consistency_checking = NULL; bool wal_init_zero = true; diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c index 0ad62e4d1f..f37251a27f 100644 --- a/src/backend/utils/misc/guc.c +++ b/src/backend/utils/misc/guc.c @@ -4822,7 +4822,7 @@ static struct config_enum ConfigureNamesEnum[] = NULL }, &wal_compression, - WAL_COMPRESSION_LZ4, wal_compression_options, + WAL_COMPRESSION_ZSTD, wal_compression_options, NULL, NULL, NULL }, -- 2.17.0
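For anyone who wants to try the "method-level" syntax (e.g. zlib-2) outside the server, here is a standalone sketch of the parsing approach used by the last patch in the series (9/9). It is not the patch's code: the sk_* names are hypothetical, the table is cut down to a few entries, the zstd default of 3 is my assumption for ZSTD_CLEVEL_DEFAULT, and it parses the level in base 10 into a long before range-checking it.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum sk_method { SK_NONE, SK_PGLZ, SK_ZLIB, SK_ZSTD };

struct sk_option
{
    const char *name;
    int         method;
    int         has_level;      /* accepts a "-N" suffix? */
    int         min_level, dfl_level, max_level;
};

static const struct sk_option sk_options[] = {
    {"off", SK_NONE, 0, 0, 0, 0},
    {"pglz", SK_PGLZ, 0, 0, 0, 0},
    {"zlib", SK_ZLIB, 1, 0, 1, 9},
    {"zstd-fast", SK_ZSTD, 1, -50, -10, -1},    /* must precede "zstd" */
    {"zstd", SK_ZSTD, 1, -50, 3, 10},           /* 3: assumed default */
};

/* Return the method ID and store the level in *level, or return -1. */
static int
sk_parse(const char *in, int *level)
{
    size_t      nopts = sizeof(sk_options) / sizeof(sk_options[0]);

    for (size_t i = 0; i < nopts; i++)
    {
        const struct sk_option *opt = &sk_options[i];
        size_t      len = strlen(opt->name);
        const char *num;
        char       *end;
        long        val;

        if (strcmp(in, opt->name) == 0)
        {
            /* bare method name: use the default level */
            *level = opt->dfl_level;
            return opt->method;
        }
        if (strncmp(in, opt->name, len) != 0 || in[len] != '-')
            continue;
        if (!opt->has_level)
            return -1;          /* e.g. "pglz-3" */

        num = in + len + 1;
        errno = 0;
        val = strtol(num, &end, 10);
        if (end == num || *end != '\0' || errno == ERANGE)
            return -1;          /* empty, trailing junk, or overflow */

        /* "zstd-fast-N" means a negative level */
        if (strcmp(opt->name, "zstd-fast") == 0 && val > 0)
            val = -val;
        if (val < opt->min_level || val > opt->max_level)
            return -1;

        *level = (int) val;
        return opt->method;
    }
    return -1;
}
```

Calling sk_parse("zlib-2", &level) yields SK_ZLIB with level 2, and sk_parse("zstd-fast-5", &level) yields SK_ZSTD with level -5. A suffix on a method without levels, or a level outside the table's range, returns -1.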
>From ecee845d049e8a7e939fe3b7bc9807ecc1b0a2c7 Mon Sep 17 00:00:00 2001 From: Justin Pryzby <pryz...@telsasoft.com> Date: Sun, 14 Mar 2021 17:12:07 -0500 Subject: [PATCH v9 9/9] Use GUC hooks to support compression 'level' --- src/backend/access/transam/xlog.c | 19 +++- src/backend/access/transam/xloginsert.c | 7 +- src/backend/access/transam/xlogreader.c | 120 ++++++++++++++++++++---- src/backend/utils/misc/guc.c | 20 ++-- src/include/access/xlog.h | 10 +- src/include/access/xlogreader.h | 2 + 6 files changed, 144 insertions(+), 34 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 599381337e..4ec688a612 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -98,7 +98,9 @@ char *XLogArchiveCommand = NULL; bool EnableHotStandby = false; bool fullPageWrites = true; bool wal_log_hints = false; -int wal_compression = WAL_COMPRESSION_ZSTD; +char *wal_compression_string = ""; /* Overwritten by GUC */ +int wal_compression = WAL_COMPRESSION_ZSTD; +int wal_compression_level = 1; char *wal_consistency_checking_string = NULL; bool *wal_consistency_checking = NULL; bool wal_init_zero = true; @@ -10603,6 +10605,21 @@ assign_xlog_sync_method(int new_sync_method, void *extra) } } +bool +check_wal_compression(char **newval, void **extra, GucSource source) +{ + int tmp; + return get_compression_level(*newval, &tmp) != -1; +} + +/* Parse the GUC into integers for wal_compression and wal_compression_level */ +void +assign_wal_compression(const char *newval, void *extra) +{ + wal_compression = get_compression_level(newval, &wal_compression_level); + Assert(wal_compression >= 0); +} + /* * Issue appropriate kind of fsync (if any) for an XLOG output file. 
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c index 96f497d5d6..16dab2e5a6 100644 --- a/src/backend/access/transam/xloginsert.c +++ b/src/backend/access/transam/xloginsert.c @@ -895,7 +895,7 @@ XLogCompressBackupBlock(char *page, uint16 hole_offset, uint16 hole_length, { unsigned long len_l = COMPRESS_BUFSIZE; int ret; - ret = compress2((Bytef*)dest, &len_l, (Bytef*)source, orig_len, 1); + ret = compress2((Bytef*)dest, &len_l, (Bytef*)source, orig_len, wal_compression_level); if (ret != Z_OK) len_l = -1; len = len_l; @@ -905,7 +905,7 @@ XLogCompressBackupBlock(char *page, uint16 hole_offset, uint16 hole_length, #ifdef USE_LZ4 case WAL_COMPRESSION_LZ4: - len = LZ4_compress_fast(source, dest, orig_len, COMPRESS_BUFSIZE, 1); + len = LZ4_compress_fast(source, dest, orig_len, COMPRESS_BUFSIZE, wal_compression_level); if (len == 0) len = -1; break; @@ -913,8 +913,7 @@ XLogCompressBackupBlock(char *page, uint16 hole_offset, uint16 hole_length, #ifdef USE_ZSTD case WAL_COMPRESSION_ZSTD: - len = ZSTD_compress(dest, COMPRESS_BUFSIZE, source, orig_len, - ZSTD_CLEVEL_DEFAULT); + len = ZSTD_compress(dest, COMPRESS_BUFSIZE, source, orig_len, wal_compression_level); if (ZSTD_isError(len)) len = -1; break; diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c index 1b13d1f660..a33f13b6cd 100644 --- a/src/backend/access/transam/xlogreader.c +++ b/src/backend/access/transam/xlogreader.c @@ -26,6 +26,7 @@ #include "catalog/pg_control.h" #include "common/pg_lzcompress.h" #include "replication/origin.h" +#include "utils/builtins.h" #include "utils/guc.h" #ifndef FRONTEND @@ -63,35 +64,41 @@ static void WALOpenSegmentInit(WALOpenSegment *seg, WALSegmentContext *segcxt, /* size of the buffer allocated for error message. */ #define MAX_ERRORMSG_LEN 1000 -/* - * Accept the likely variants for none and pglz, for compatibility with old - * server versions where wal_compression was a boolean. 
- */
-const struct config_enum_entry wal_compression_options[] = {
+static const struct {
+    char *name;
+    enum WalCompression compress_id; /* The internal ID (which includes the compression level) */
+    bool has_level; /* If it accepts a numeric "level" */
+    int min_level, dfl_level, max_level;
+} wal_compression_options[] = {
+    /*
+     * Accept the likely variants for none and pglz, for compatibility with old
+     * server versions where wal_compression was a boolean.
+     */
     {"off", WAL_COMPRESSION_NONE, false},
     {"none", WAL_COMPRESSION_NONE, false},
-    {"false", WAL_COMPRESSION_NONE, true},
-    {"no", WAL_COMPRESSION_NONE, true},
-    {"0", WAL_COMPRESSION_NONE, true},
+    {"false", WAL_COMPRESSION_NONE, false},
+    {"no", WAL_COMPRESSION_NONE, false},
+    {"0", WAL_COMPRESSION_NONE, false},
+
     {"pglz", WAL_COMPRESSION_PGLZ, false},
-    {"true", WAL_COMPRESSION_PGLZ, true},
-    {"yes", WAL_COMPRESSION_PGLZ, true},
-    {"on", WAL_COMPRESSION_PGLZ, true},
-    {"1", WAL_COMPRESSION_PGLZ, true},
+    {"true", WAL_COMPRESSION_PGLZ, false},
+    {"yes", WAL_COMPRESSION_PGLZ, false},
+    {"on", WAL_COMPRESSION_PGLZ, false},
+    {"1", WAL_COMPRESSION_PGLZ, false},
 
 #ifdef HAVE_LIBZ
-    {"zlib", WAL_COMPRESSION_ZLIB, false},
+    {"zlib", WAL_COMPRESSION_ZLIB, true, 0, 1, 9},
 #endif
 
 #ifdef USE_LZ4
-    {"lz4", WAL_COMPRESSION_LZ4, false},
+    {"lz4", WAL_COMPRESSION_LZ4, true, 1, 1, 65537},
 #endif
 
 #ifdef USE_ZSTD
-    {"zstd", WAL_COMPRESSION_ZSTD, false},
+    {"zstd-fast", WAL_COMPRESSION_ZSTD, true, -50, -10, -1 }, /* Must be before zstd... */
+    {"zstd", WAL_COMPRESSION_ZSTD, true, -50, ZSTD_CLEVEL_DEFAULT, 10},
 #endif
-    {NULL, 0, false}
 };
 
 /*
@@ -1578,6 +1585,83 @@ XLogRecGetBlockData(XLogReaderState *record, uint8 block_id, Size *len)
     }
 }
 
+/*
+ * Return the wal compression ID, or -1 if the input is
+ * invalid/unrecognized/unsupported.
+ * The compression level is stored in *level.
+ */
+int
+get_compression_level(const char *in, int *level)
+{
+    for (int idx=0; idx < lengthof(wal_compression_options); ++idx)
+    {
+        int len;
+        int tmp;
+        char *end;
+
+        if (strcmp(in, wal_compression_options[idx].name) == 0)
+        {
+            /* it has no -level suffix */
+            *level = wal_compression_options[idx].dfl_level;
+            return wal_compression_options[idx].compress_id;
+        }
+
+        len = strlen(wal_compression_options[idx].name);
+        if (strncmp(in, wal_compression_options[idx].name, len) != 0)
+            continue;
+        if (in[len] != '-')
+            continue;
+
+        /* it has a -level suffix, but level is not allowed */
+        if (!wal_compression_options[idx].has_level)
+        {
+#ifndef FRONTEND
+            GUC_check_errdetail("Compression method does not accept a compression level");
+#endif
+            return -1;
+        }
+
+        in += len + 1;
+        len = strlen(in);
+        errno = 0;
+        tmp = strtol(in, &end, 0);
+        if (end != in+len || end == in ||
+            (errno != 0 && tmp == 0) ||
+            (errno == ERANGE && (tmp == LONG_MIN || tmp == LONG_MAX)))
+        {
+#ifndef FRONTEND
+            GUC_check_errdetail("Could not parse compression level: %s", in);
+#endif
+            return -1;
+        }
+
+        /*
+         * For convenience, allow specification of zstd-fast-N, which is
+         * interpreted as a negative compression level.
+         */
+        if (strncmp(wal_compression_options[idx].name, "zstd-fast", 9) == 0 &&
+            tmp > 0)
+            tmp = -tmp;
+
+        if (tmp < wal_compression_options[idx].min_level ||
+            tmp > wal_compression_options[idx].max_level)
+        {
+#ifndef FRONTEND
+            GUC_check_errdetail("Compression level is outside of allowed range: %d...%d",
+                wal_compression_options[idx].min_level,
+                wal_compression_options[idx].max_level);
+#endif
+            return -1;
+        }
+
+        *level = tmp;
+        return wal_compression_options[idx].compress_id;
+    }
+
+    return -1;
+}
+
+
 /*
  * Return a statically allocated string associated with the given compression
  * method.
@@ -1585,9 +1669,9 @@ XLogRecGetBlockData(XLogReaderState *record, uint8 block_id, Size *len)
 const char *
 wal_compression_name(WalCompression compression)
 {
-    for (int i=0; wal_compression_options[i].name != NULL; ++i)
+    for (int i=0; i < lengthof(wal_compression_options); ++i)
     {
-        if (wal_compression_options[i].val == compression)
+        if (wal_compression_options[i].compress_id == compression)
             return wal_compression_options[i].name;
     }
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index f37251a27f..29cb12c8a5 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -4556,6 +4556,16 @@ static struct config_string ConfigureNamesString[] =
         check_wal_consistency_checking, assign_wal_consistency_checking, NULL
     },
 
+    {
+        {"wal_compression", PGC_SUSET, WAL_SETTINGS,
+            gettext_noop("Set the method used to compress full page images in the WAL."),
+            NULL
+        },
+        &wal_compression_string,
+        "zstd",
+        check_wal_compression, assign_wal_compression, NULL
+    },
+
     {
         {"jit_provider", PGC_POSTMASTER, CLIENT_CONN_PRELOAD,
             gettext_noop("JIT provider to use."),
@@ -4816,16 +4826,6 @@ static struct config_enum ConfigureNamesEnum[] =
         NULL, NULL, NULL
     },
 
-    {
-        {"wal_compression", PGC_SUSET, WAL_SETTINGS,
-            gettext_noop("Set the method used to compress full page images in the WAL."),
-            NULL
-        },
-        &wal_compression,
-        WAL_COMPRESSION_ZSTD, wal_compression_options,
-        NULL, NULL, NULL
-    },
-
     {
         {"dynamic_shared_memory_type", PGC_POSTMASTER, RESOURCES_MEM,
             gettext_noop("Selects the dynamic shared memory implementation used."),
diff --git a/src/include/access/xlog.h b/src/include/access/xlog.h
index e8b2c53784..7a05838d0b 100644
--- a/src/include/access/xlog.h
+++ b/src/include/access/xlog.h
@@ -19,6 +19,7 @@
 #include "lib/stringinfo.h"
 #include "nodes/pg_list.h"
 #include "storage/fd.h"
+#include "utils/guc.h"
 
 
 /* Sync methods */
@@ -116,7 +117,6 @@ extern char *XLogArchiveCommand;
 extern bool EnableHotStandby;
 extern bool fullPageWrites;
 extern bool wal_log_hints;
-extern int wal_compression;
 extern bool wal_init_zero;
 extern bool wal_recycle;
 extern bool *wal_consistency_checking;
@@ -143,6 +143,9 @@ extern char *PromoteTriggerFile;
 extern RecoveryTargetTimeLineGoal recoveryTargetTimeLineGoal;
 extern TimeLineID recoveryTargetTLIRequested;
 extern TimeLineID recoveryTargetTLI;
+extern char *wal_compression_string;
+extern int wal_compression;
+extern int wal_compression_level;
 
 extern int CheckPointSegments;
 
@@ -361,6 +364,11 @@ extern void XLogRequestWalReceiverReply(void);
 extern void assign_max_wal_size(int newval, void *extra);
 extern void assign_checkpoint_completion_target(double newval, void *extra);
 
+/* GUC */
+extern bool check_wal_compression(char **newval, void **extra, GucSource source);
+extern void assign_wal_compression(const char *newval, void *extra);
+
+
 /*
  * Routines to start, stop, and get status of a base backup.
  */
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
index 21d200d3df..b4d0ab4517 100644
--- a/src/include/access/xlogreader.h
+++ b/src/include/access/xlogreader.h
@@ -327,4 +327,6 @@ extern bool XLogRecGetBlockTag(XLogReaderState *record, uint8 block_id,
                                RelFileNode *rnode, ForkNumber *forknum,
                                BlockNumber *blknum);
 
+extern int get_compression_level(const char *in, int *level);
+
 #endif /* XLOGREADER_H */
-- 
2.17.0
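
For anyone who wants to poke at the "method-level" parsing outside the server, here is a rough standalone sketch of the same idea as get_compression_level(). It is not the patch code: parse_method(), opts[] and enum method are hypothetical names, the level ranges are copied from the table in the patch, the zstd default is written as 3 on the assumption that ZSTD_CLEVEL_DEFAULT is 3, and it parses base 10 only (the patch passes base 0 to strtol). There is no GUC error reporting, just a -1 return.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the patch's WalCompression IDs. */
enum method { M_NONE, M_PGLZ, M_ZLIB, M_LZ4, M_ZSTD };

static const struct opt {
	const char *name;
	int			id;
	int			has_level;	/* accepts a "-N" suffix? */
	int			min_level, dfl_level, max_level;
} opts[] = {
	{"off", M_NONE, 0, 0, 0, 0},
	{"pglz", M_PGLZ, 0, 0, 0, 0},
	{"zlib", M_ZLIB, 1, 0, 1, 9},
	{"lz4", M_LZ4, 1, 1, 1, 65537},
	/* Must come before "zstd", else "zstd-fast-3" would match "zstd" first */
	{"zstd-fast", M_ZSTD, 1, -50, -10, -1},
	{"zstd", M_ZSTD, 1, -50, 3, 10},	/* 3: assumed ZSTD_CLEVEL_DEFAULT */
};

/* Return the method ID and store the level, or return -1 on bad input. */
static int
parse_method(const char *in, int *level)
{
	assert(in != NULL && level != NULL);

	for (size_t i = 0; i < sizeof(opts) / sizeof(opts[0]); i++)
	{
		size_t		len = strlen(opts[i].name);
		const char *num;
		char	   *end;
		long		val;

		/* Bare method name: use its default level */
		if (strcmp(in, opts[i].name) == 0)
		{
			*level = opts[i].dfl_level;
			return opts[i].id;
		}

		/* Otherwise require "<name>-<level>" */
		if (strncmp(in, opts[i].name, len) != 0 || in[len] != '-')
			continue;
		if (!opts[i].has_level)
			return -1;

		num = in + len + 1;
		errno = 0;
		val = strtol(num, &end, 10);
		if (end == num || *end != '\0' || errno == ERANGE)
			return -1;

		/* zstd-fast-N is shorthand for the negative level -N */
		if (strcmp(opts[i].name, "zstd-fast") == 0 && val > 0)
			val = -val;

		if (val < opts[i].min_level || val > opts[i].max_level)
			return -1;

		*level = (int) val;
		return opts[i].id;
	}

	return -1;
}
```

With this table, "zlib-9" gives level 9, "zstd-fast-3" gives zstd at level -3, and "pglz-1" is rejected because pglz takes no level. Keeping the parsed value in a long until after the range check avoids the truncation you would get by storing strtol()'s result directly in an int.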