Hi,
On 02/08/2019 21:48, Tomas Vondra wrote:
> On Fri, Aug 02, 2019 at 11:20:03AM -0700, Andres Freund wrote:
>> Another question is whether we'd actually want to include the code in
>> core directly, or use system libraries (and if some packagers might
>> decide to disable that, for whatever reason).
>>
>> I'd personally say we should have an included version, and a
>> --with-system-... flag that uses the system one.
>
> OK. I'd say to require a system library, but that's a minor detail.
Same here.
So that we're not just idly talking, what do you think about the attached?
It:
- adds a new GUC compression_algorithm with possible values of pglz
(default) and lz4 (if lz4 is compiled in); requires SIGHUP
- adds a --with-lz4 configure option (default yes, so the configure option
is actually --without-lz4) that enables lz4 via the system library
- uses the compression_algorithm for both TOAST and WAL compression (if on)
- supports slicing for lz4 as well (pglz was already supported)
- supports reading old TOAST values
- adds a 1-byte header to the compressed data in which we currently store
the algorithm kind, which leaves us 254 more to add :) (that's extra
overhead compared to the current state)
- changes the rawsize in TOAST header to 31 bits via bit packing
- uses the extra bit to differentiate between old and new format
- supports reading from table which has different rows stored with
different algorithm (so that the GUC itself can be freely changed)
Simple docs and a TAP test included.
I did some basic performance testing (it's not really my thing though,
so I would appreciate it if somebody did more).
I get about a 7x performance improvement on data load with lz4 compared
to pglz on my dataset, but strangely only a tiny decompression
improvement. Perhaps more importantly, I also ran before-patch and
after-patch tests with pglz, and the performance difference with my
data set was <1%.
Note that this just links against lz4; it does not add lz4 to the
PostgreSQL code base.
The issues I know of:
- the pg_decompress function really ought to throw an error in the default
branch, but that file is also used in the frontend, so I'm not sure how to
do that
- the TAP test probably does not work with all possible configurations
(but that's why it needs to be set in PG_TEST_EXTRA like for example ssl)
- we don't really have any automated test for reading the old TOAST
format; I have no idea how to do that
- I expect my changes to configure.in are not the greatest, as I have
pretty much zero experience with autoconf
--
Petr Jelinek
2ndQuadrant - PostgreSQL Solutions for the Enterprise
https://www.2ndQuadrant.com/
From 862919a271e6bfa151fde2fed19c9e5ae6227cdc Mon Sep 17 00:00:00 2001
From: Petr Jelinek <pjmo...@pjmodos.net>
Date: Sun, 4 Aug 2019 02:02:30 +0200
Subject: [PATCH] Add new GUC compression_algorithm
Sets which algorithm to use for TOAST and WAL compression.
Currently allows either pglz, which is the standard PostgreSQL algorithm,
or lz4, if PostgreSQL was configured with --with-lz4 (which is the
default).
The implementation allows different values to have different compression
algorithms and also supports reading the old TOAST format, which always
uses pglz.
---
configure | 116 ++++++++++++++++++--
configure.in | 19 ++++
doc/src/sgml/config.sgml | 38 +++++++
doc/src/sgml/storage.sgml | 5 +-
src/Makefile.global.in | 1 +
src/backend/access/heap/tuptoaster.c | 86 +++++++++++----
src/backend/access/transam/xloginsert.c | 5 +-
src/backend/utils/misc/guc.c | 23 +++-
src/common/pg_lzcompress.c | 134 +++++++++++++++++++++++-
src/include/common/pg_lzcompress.h | 49 +++++++--
src/include/pg_config.h.in | 3 +
src/include/postgres.h | 3 +-
src/test/Makefile | 8 +-
src/test/toast/.gitignore | 2 +
src/test/toast/Makefile | 25 +++++
src/test/toast/README | 25 +++++
src/test/toast/t/001_lz4.pl | 124 ++++++++++++++++++++++
17 files changed, 619 insertions(+), 47 deletions(-)
create mode 100644 src/test/toast/.gitignore
create mode 100644 src/test/toast/Makefile
create mode 100644 src/test/toast/README
create mode 100644 src/test/toast/t/001_lz4.pl
diff --git a/configure b/configure
index 7a6bfc2339..edd2bfefd6 100755
--- a/configure
+++ b/configure
@@ -704,6 +704,7 @@ with_system_tzdata
with_libxslt
with_libxml
XML2_CONFIG
+with_lz4
UUID_EXTRA_OBJS
with_uuid
with_systemd
@@ -795,6 +796,7 @@ infodir
docdir
oldincludedir
includedir
+runstatedir
localstatedir
sharedstatedir
sysconfdir
@@ -859,6 +861,7 @@ with_readline
with_libedit_preferred
with_uuid
with_ossp_uuid
+with_lz4
with_libxml
with_libxslt
with_system_tzdata
@@ -932,6 +935,7 @@ datadir='${datarootdir}'
sysconfdir='${prefix}/etc'
sharedstatedir='${prefix}/com'
localstatedir='${prefix}/var'
+runstatedir='${localstatedir}/run'
includedir='${prefix}/include'
oldincludedir='/usr/include'
docdir='${datarootdir}/doc/${PACKAGE_TARNAME}'
@@ -1184,6 +1188,15 @@ do
| -silent | --silent | --silen | --sile | --sil)
silent=yes ;;
+ -runstatedir | --runstatedir | --runstatedi | --runstated \
+ | --runstate | --runstat | --runsta | --runst | --runs \
+ | --run | --ru | --r)
+ ac_prev=runstatedir ;;
+ -runstatedir=* | --runstatedir=* | --runstatedi=* | --runstated=* \
+ | --runstate=* | --runstat=* | --runsta=* | --runst=* | --runs=* \
+ | --run=* | --ru=* | --r=*)
+ runstatedir=$ac_optarg ;;
+
-sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb)
ac_prev=sbindir ;;
-sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \
@@ -1321,7 +1334,7 @@ fi
for ac_var in exec_prefix prefix bindir sbindir libexecdir datarootdir \
datadir sysconfdir sharedstatedir localstatedir includedir \
oldincludedir docdir infodir htmldir dvidir pdfdir psdir \
- libdir localedir mandir
+ libdir localedir mandir runstatedir
do
eval ac_val=\$$ac_var
# Remove trailing slashes.
@@ -1474,6 +1487,7 @@ Fine tuning of the installation directories:
--sysconfdir=DIR read-only single-machine data [PREFIX/etc]
--sharedstatedir=DIR modifiable architecture-independent data [PREFIX/com]
--localstatedir=DIR modifiable single-machine data [PREFIX/var]
+ --runstatedir=DIR modifiable per-process data [LOCALSTATEDIR/run]
--libdir=DIR object code libraries [EPREFIX/lib]
--includedir=DIR C header files [PREFIX/include]
--oldincludedir=DIR C header files for non-gcc [/usr/include]
@@ -1564,6 +1578,7 @@ Optional Packages:
prefer BSD Libedit over GNU Readline
--with-uuid=LIB build contrib/uuid-ossp using LIB (bsd,e2fs,ossp)
--with-ossp-uuid obsolete spelling of --with-uuid=ossp
+ --without-lz4 do not build with LZ4 support
--with-libxml build with XML support
--with-libxslt use XSLT support when building contrib/xml2
--with-system-tzdata=DIR
@@ -8115,6 +8130,34 @@ fi
+#
+# LZ4
+#
+
+
+
+# Check whether --with-lz4 was given.
+if test "${with_lz4+set}" = set; then :
+ withval=$with_lz4;
+ case $withval in
+ yes)
+ :
+ ;;
+ no)
+ :
+ ;;
+ *)
+ as_fn_error $? "no argument expected for --with-lz4 option" "$LINENO" 5
+ ;;
+ esac
+
+else
+ with_lz4=yes
+
+fi
+
+
+
#
# XML
@@ -12661,6 +12704,55 @@ fi
fi
+# for lz4 compression support
+if test "$with_lz4" = yes ; then
+ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for LZ4_sizeofState in -llz4" >&5
+$as_echo_n "checking for LZ4_sizeofState in -llz4... " >&6; }
+if ${ac_cv_lib_lz4_LZ4_sizeofState+:} false; then :
+ $as_echo_n "(cached) " >&6
+else
+ ac_check_lib_save_LIBS=$LIBS
+LIBS="-llz4 $LIBS"
+cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h. */
+
+/* Override any GCC internal prototype to avoid an error.
+ Use char because int might match the return type of a GCC
+ builtin and then its argument prototype would still apply. */
+#ifdef __cplusplus
+extern "C"
+#endif
+char LZ4_sizeofState ();
+int
+main ()
+{
+return LZ4_sizeofState ();
+ ;
+ return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+ ac_cv_lib_lz4_LZ4_sizeofState=yes
+else
+ ac_cv_lib_lz4_LZ4_sizeofState=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+ conftest$ac_exeext conftest.$ac_ext
+LIBS=$ac_check_lib_save_LIBS
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_lz4_LZ4_sizeofState" >&5
+$as_echo "$ac_cv_lib_lz4_LZ4_sizeofState" >&6; }
+if test "x$ac_cv_lib_lz4_LZ4_sizeofState" = xyes; then :
+ COMPRESSION_LIBS=" -llz4"
+else
+ as_fn_error $? "library 'lz4' is required for LZ4 compression support" "$LINENO" 5
+fi
+
+
+$as_echo "#define HAVE_LZ4 1" >>confdefs.h
+
+fi
+LIBS="$LIBS$COMPRESSION_LIBS"
##
## Header files
@@ -13340,6 +13432,18 @@ fi
done
+fi
+
+# for lz4 compression support
+if test "$with_lz4" = yes ; then
+ ac_fn_c_check_header_mongrel "$LINENO" "lz4.h" "ac_cv_header_lz4_h" "$ac_includes_default"
+if test "x$ac_cv_header_lz4_h" = xyes; then :
+
+else
+ as_fn_error $? "header file <lz4.h> is required for LZ4 support" "$LINENO" 5
+fi
+
+
fi
if test "$PORTNAME" = "win32" ; then
@@ -14683,7 +14787,7 @@ else
We can't simply define LARGE_OFF_T to be 9223372036854775807,
since some C++ compilers masquerading as C compilers
incorrectly reject 9223372036854775807. */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
&& LARGE_OFF_T % 2147483647 == 1)
? 1 : -1];
@@ -14729,7 +14833,7 @@ else
We can't simply define LARGE_OFF_T to be 9223372036854775807,
since some C++ compilers masquerading as C compilers
incorrectly reject 9223372036854775807. */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
&& LARGE_OFF_T % 2147483647 == 1)
? 1 : -1];
@@ -14753,7 +14857,7 @@ rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
We can't simply define LARGE_OFF_T to be 9223372036854775807,
since some C++ compilers masquerading as C compilers
incorrectly reject 9223372036854775807. */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
&& LARGE_OFF_T % 2147483647 == 1)
? 1 : -1];
@@ -14798,7 +14902,7 @@ else
We can't simply define LARGE_OFF_T to be 9223372036854775807,
since some C++ compilers masquerading as C compilers
incorrectly reject 9223372036854775807. */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
&& LARGE_OFF_T % 2147483647 == 1)
? 1 : -1];
@@ -14822,7 +14926,7 @@ rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
We can't simply define LARGE_OFF_T to be 9223372036854775807,
since some C++ compilers masquerading as C compilers
incorrectly reject 9223372036854775807. */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
&& LARGE_OFF_T % 2147483647 == 1)
? 1 : -1];
diff --git a/configure.in b/configure.in
index dde3eec89f..393a820cc2 100644
--- a/configure.in
+++ b/configure.in
@@ -915,6 +915,12 @@ fi
AC_SUBST(with_uuid)
AC_SUBST(UUID_EXTRA_OBJS)
+#
+# LZ4
+#
+PGAC_ARG_BOOL(with, lz4, yes,
+ [do not build with LZ4 support])
+AC_SUBST(with_lz4)
#
# XML
@@ -1263,6 +1269,14 @@ elif test "$with_uuid" = ossp ; then
fi
AC_SUBST(UUID_LIBS)
+# for lz4 compression support
+if test "$with_lz4" = yes ; then
+ AC_CHECK_LIB(lz4, LZ4_sizeofState,
+ [COMPRESSION_LIBS=" -llz4"],
+ [AC_MSG_ERROR([library 'lz4' is required for LZ4 compression support])])
+ AC_DEFINE([HAVE_LZ4], 1, [Define to 1 to build with LZ4 support])
+fi
+LIBS="$LIBS$COMPRESSION_LIBS"
##
## Header files
@@ -1443,6 +1457,11 @@ elif test "$with_uuid" = ossp ; then
[AC_MSG_ERROR([header file <ossp/uuid.h> or <uuid.h> is required for OSSP UUID])])])
fi
+# for lz4 compression support
+if test "$with_lz4" = yes ; then
+ AC_CHECK_HEADER(lz4.h, [], [AC_MSG_ERROR([header file <lz4.h> is required for LZ4 support])])
+fi
+
if test "$PORTNAME" = "win32" ; then
AC_CHECK_HEADERS(crtdefs.h)
fi
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index c91e3e1550..81bd26b653 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1793,6 +1793,42 @@ include_dir 'conf.d'
<title>Disk</title>
<variablelist>
+
+ <varlistentry id="guc-compression-algorithm" xreflabel="compression_algorithm">
+ <term><varname>compression_algorithm</varname> (<type>enum</type>)
+ <indexterm>
+ <primary><varname>compression_algorithm</varname> configuration parameter</primary>
+ </indexterm>
+ </term>
+ <listitem>
+ <para>
+ Which compression algorithm to use for compressing
+ <acronym>TOAST</acronym> data and when
+ <xref linkend="guc-wal-compression"/> is turned on also for
+ <acronym>WAL</acronym>.
+ Possible values are:
+ </para>
+ <itemizedlist>
+ <listitem>
+ <para>
+ <literal>pglz</literal> (the internal PostgreSQL LZ family compression algorithm)
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ <literal>lz4</literal> (the LZ4 compression algorithm)
+ </para>
+ </listitem>
+ </itemizedlist>
+ <para>
+ Not all of these choices are available on all platforms.
+ The default is the <literal>pglz</literal> algorithm, which is the one
+ used by PostgreSQL version 12 and earlier.
+ Only superusers can change this setting.
+ </para>
+ </listitem>
+ </varlistentry>
+
<varlistentry id="guc-temp-file-limit" xreflabel="temp_file_limit">
<term><varname>temp_file_limit</varname> (<type>integer</type>)
<indexterm>
@@ -2728,6 +2764,8 @@ include_dir 'conf.d'
<xref linkend="guc-full-page-writes"/> is on or during a base backup.
A compressed page image will be decompressed during WAL replay.
The default value is <literal>off</literal>.
+ The compression used is the one specified by
+ <xref linkend="guc-compression-algorithm"/>.
Only superusers can change this setting.
</para>
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 1047c77a63..306a0bb3d3 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -394,9 +394,8 @@ Further details appear in <xref linkend="storage-toast-inmemory"/>.
<para>
The compression technique used for either in-line or out-of-line compressed
-data is a fairly simple and very fast member
-of the LZ family of compression techniques. See
-<filename>src/common/pg_lzcompress.c</filename> for the details.
+data is chosen based on the <xref linkend="guc-compression-algorithm"/>
+setting.
</para>
<sect2 id="storage-toast-ondisk">
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index dc3f207e1c..8c3d38db1c 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -195,6 +195,7 @@ with_libxslt = @with_libxslt@
with_llvm = @with_llvm@
with_system_tzdata = @with_system_tzdata@
with_uuid = @with_uuid@
+with_lz4 = @with_lz4@
with_zlib = @with_zlib@
enable_rpath = @enable_rpath@
enable_nls = @enable_nls@
diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index 74233bb931..14b64936ab 100644
--- a/src/backend/access/heap/tuptoaster.c
+++ b/src/backend/access/heap/tuptoaster.c
@@ -52,7 +52,20 @@
typedef struct toast_compress_header
{
int32 vl_len_; /* varlena header (do not touch directly!) */
- int32 rawsize;
+ /*
+ * The length cannot be more than 1GB due to general TOAST limitations,
+ * so the two high bits are free to encode additional information.
+ *
+ * We use the last (highest) bit to mark this datum as being in the "new
+ * compression format": the new pg_compress has its own header, and the
+ * original pglz format is not distinguishable in any way from the
+ * format used by pg_compress. Thanks to this flag,
+ * toast_decompress_datum can either call pglz_decompress directly when
+ * dealing with data written by older versions of PostgreSQL, or let
+ * pg_decompress autodetect the format.
+ */
+ int32 rawsize:31;
+ uint32 cformat:1;
} toast_compress_header;
/*
@@ -61,10 +74,13 @@ typedef struct toast_compress_header
*/
#define TOAST_COMPRESS_HDRSZ ((int32) sizeof(toast_compress_header))
#define TOAST_COMPRESS_RAWSIZE(ptr) (((toast_compress_header *) (ptr))->rawsize)
+#define TOAST_COMPRESS_CFORMAT(ptr) (((toast_compress_header *) (ptr))->cformat)
#define TOAST_COMPRESS_RAWDATA(ptr) \
(((char *) (ptr)) + TOAST_COMPRESS_HDRSZ)
#define TOAST_COMPRESS_SET_RAWSIZE(ptr, len) \
(((toast_compress_header *) (ptr))->rawsize = (len))
+#define TOAST_COMPRESS_SET_CFORMAT(ptr, fmt) \
+ (((toast_compress_header *) (ptr))->cformat = (fmt))
static void toast_delete_datum(Relation rel, Datum value, bool is_speculative);
static Datum toast_save_datum(Relation rel, Datum value,
@@ -385,7 +401,7 @@ toast_raw_datum_size(Datum value)
else if (VARATT_IS_COMPRESSED(attr))
{
/* here, va_rawsize is just the payload size */
- result = VARRAWSIZE_4B_C(attr) + VARHDRSZ;
+ result = TOAST_COMPRESS_RAWSIZE(attr) + VARHDRSZ;
}
else if (VARATT_IS_SHORT(attr))
{
@@ -1363,6 +1379,7 @@ toast_compress_datum(Datum value)
{
struct varlena *tmp;
int32 valsize = VARSIZE_ANY_EXHDR(DatumGetPointer(value));
+ int32 buffer_capacity;
int32 len;
Assert(!VARATT_IS_EXTERNAL(DatumGetPointer(value)));
@@ -1376,11 +1393,11 @@ toast_compress_datum(Datum value)
valsize > PGLZ_strategy_default->max_input_size)
return PointerGetDatum(NULL);
- tmp = (struct varlena *) palloc(PGLZ_MAX_OUTPUT(valsize) +
- TOAST_COMPRESS_HDRSZ);
+ buffer_capacity = pg_compress_bound(valsize);
+ tmp = (struct varlena *) palloc(buffer_capacity + TOAST_COMPRESS_HDRSZ);
/*
- * We recheck the actual size even if pglz_compress() reports success,
+ * We recheck the actual size even if pg_compress() reports success,
* because it might be satisfied with having saved as little as one byte
* in the compressed data --- which could turn into a net loss once you
* consider header and alignment padding. Worst case, the compressed
@@ -1389,14 +1406,17 @@ toast_compress_datum(Datum value)
* only one header byte and no padding if the value is short enough. So
* we insist on a savings of more than 2 bytes to ensure we have a gain.
*/
- len = pglz_compress(VARDATA_ANY(DatumGetPointer(value)),
- valsize,
- TOAST_COMPRESS_RAWDATA(tmp),
- PGLZ_strategy_default);
+ len = pg_compress(VARDATA_ANY(DatumGetPointer(value)),
+ valsize,
+ TOAST_COMPRESS_RAWDATA(tmp),
+ buffer_capacity,
+ PGLZ_strategy_default);
+
if (len >= 0 &&
len + TOAST_COMPRESS_HDRSZ < valsize - 2)
{
TOAST_COMPRESS_SET_RAWSIZE(tmp, valsize);
+ TOAST_COMPRESS_SET_CFORMAT(tmp, 1);
SET_VARSIZE_COMPRESSED(tmp, len + TOAST_COMPRESS_HDRSZ);
/* successful compression */
return PointerGetDatum(tmp);
@@ -1520,7 +1540,7 @@ toast_save_datum(Relation rel, Datum value,
data_p = VARDATA(dval);
data_todo = VARSIZE(dval) - VARHDRSZ;
/* rawsize in a compressed datum is just the size of the payload */
- toast_pointer.va_rawsize = VARRAWSIZE_4B_C(dval) + VARHDRSZ;
+ toast_pointer.va_rawsize = TOAST_COMPRESS_RAWSIZE(dval) + VARHDRSZ;
toast_pointer.va_extsize = data_todo;
/* Assert that the numbers look like it's compressed */
Assert(VARATT_EXTERNAL_IS_COMPRESSED(toast_pointer));
@@ -2277,18 +2297,40 @@ static struct varlena *
toast_decompress_datum(struct varlena *attr)
{
struct varlena *result;
+ uint32 compression_format;
+ int32 raw_size;
Assert(VARATT_IS_COMPRESSED(attr));
- result = (struct varlena *)
- palloc(TOAST_COMPRESS_RAWSIZE(attr) + VARHDRSZ);
- SET_VARSIZE(result, TOAST_COMPRESS_RAWSIZE(attr) + VARHDRSZ);
+ raw_size = TOAST_COMPRESS_RAWSIZE(attr);
- if (pglz_decompress(TOAST_COMPRESS_RAWDATA(attr),
- VARSIZE(attr) - TOAST_COMPRESS_HDRSZ,
- VARDATA(result),
- TOAST_COMPRESS_RAWSIZE(attr), true) < 0)
- elog(ERROR, "compressed data is corrupted");
+ compression_format = TOAST_COMPRESS_CFORMAT(attr);
+
+ result = (struct varlena *) palloc(raw_size + VARHDRSZ);
+ SET_VARSIZE(result, raw_size + VARHDRSZ);
+
+ /*
+ * Support the legacy compressed TOAST format, which always uses pglz.
+ */
+ switch (compression_format)
+ {
+ case 0:
+ if (pglz_decompress(TOAST_COMPRESS_RAWDATA(attr),
+ VARSIZE(attr) - TOAST_COMPRESS_HDRSZ,
+ VARDATA(result),
+ raw_size, true) < 0)
+ elog(ERROR, "compressed data is corrupted");
+ break;
+ case 1:
+ if (pg_decompress(TOAST_COMPRESS_RAWDATA(attr),
+ VARSIZE(attr) - TOAST_COMPRESS_HDRSZ,
+ VARDATA(result),
+ raw_size, true) < 0)
+ elog(ERROR, "compressed data is corrupted");
+ break;
+ default:
+ pg_unreachable();
+ }
return result;
}
@@ -2311,10 +2353,10 @@ toast_decompress_datum_slice(struct varlena *attr, int32 slicelength)
result = (struct varlena *) palloc(slicelength + VARHDRSZ);
- rawsize = pglz_decompress(TOAST_COMPRESS_RAWDATA(attr),
- VARSIZE(attr) - TOAST_COMPRESS_HDRSZ,
- VARDATA(result),
- slicelength, false);
+ rawsize = pg_decompress(TOAST_COMPRESS_RAWDATA(attr),
+ VARSIZE(attr) - TOAST_COMPRESS_HDRSZ,
+ VARDATA(result),
+ slicelength, false);
if (rawsize < 0)
elog(ERROR, "compressed data is corrupted");
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index 3ec67d468b..76decbf426 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -830,11 +830,12 @@ XLogCompressBackupBlock(char *page, uint16 hole_offset, uint16 hole_length,
source = page;
/*
- * We recheck the actual size even if pglz_compress() reports success and
+ * We recheck the actual size even if pg_compress() reports success and
* see if the number of bytes saved by compression is larger than the
* length of extra data needed for the compressed version of block image.
*/
- len = pglz_compress(source, orig_len, dest, PGLZ_strategy_default);
+ len = pg_compress(source, orig_len, dest, PGLZ_MAX_BLCKSZ,
+ PGLZ_strategy_default);
if (len >= 0 &&
len + extra_bytes < orig_len)
{
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index fc463601ff..04d99b97bf 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -42,6 +42,7 @@
#include "commands/vacuum.h"
#include "commands/variable.h"
#include "commands/trigger.h"
+#include "common/pg_lzcompress.h"
#include "common/string.h"
#include "funcapi.h"
#include "jit/jit.h"
@@ -470,6 +471,14 @@ static struct config_enum_entry shared_memory_options[] = {
{NULL, 0, false}
};
+static const struct config_enum_entry compression_algorithm_options[] = {
+ {"pglz", COMPRESS_ALGO_PGLZ, false},
+#ifdef HAVE_LZ4
+ {"lz4", COMPRESS_ALGO_LZ4, false},
+#endif
+ {NULL, 0, false}
+};
+
/*
* Options for enum values stored in other modules
*/
@@ -4540,7 +4549,7 @@ static struct config_enum ConfigureNamesEnum[] =
{
{"ssl_max_protocol_version", PGC_SIGHUP, CONN_AUTH_SSL,
gettext_noop("Sets the maximum SSL/TLS protocol version to use."),
NULL,
GUC_SUPERUSER_ONLY
},
@@ -4550,6 +4559,18 @@ static struct config_enum ConfigureNamesEnum[] =
NULL, NULL, NULL
},
+ {
+ {"compression_algorithm", PGC_SIGHUP, RESOURCES_DISK,
+ gettext_noop("Chooses the compression algorithm for TOAST and WAL."),
+ NULL,
+ GUC_SUPERUSER_ONLY
+ },
+ &compression_algorithm,
+ COMPRESS_ALGO_PGLZ,
+ compression_algorithm_options,
+ NULL, NULL, NULL
+ },
+
/* End-of-list marker */
{
{NULL, 0, 0, NULL, NULL}, NULL, 0, NULL, NULL, NULL, NULL
diff --git a/src/common/pg_lzcompress.c b/src/common/pg_lzcompress.c
index 988b3987d0..8e8b966d67 100644
--- a/src/common/pg_lzcompress.c
+++ b/src/common/pg_lzcompress.c
@@ -187,6 +187,7 @@
#include "common/pg_lzcompress.h"
+int compression_algorithm = COMPRESS_ALGO_PGLZ;
/* ----------
* Local definitions
@@ -505,8 +506,8 @@ pglz_find_match(int16 *hstart, const char *input, const char *end,
* bytes written in buffer dest, or -1 if compression fails.
* ----------
*/
-int32
-pglz_compress(const char *source, int32 slen, char *dest,
+static int32
+pglz_compress(const char *source, int32 slen, char *dest, int32 capacity,
const PGLZ_Strategy *strategy)
{
unsigned char *bp = (unsigned char *) dest;
@@ -771,3 +772,132 @@ pglz_decompress(const char *source, int32 slen, char *dest,
*/
return (char *) dp - dest;
}
+
+#ifdef HAVE_LZ4
+#include "utils/elog.h"
+
+static int32
+lz4_compress(const char *source, int32 slen, char *dest, int32 capacity,
+ const PGLZ_Strategy *strategy)
+{
+ int ret;
+
+ ret = LZ4_compress_default(source, dest, slen, capacity);
+
+ /*
+ * In case of a compression error, return -1, which callers should
+ * treat as incompressible data.
+ */
+ if (ret == 0)
+ return -1;
+
+ return ret;
+}
+
+int32
+lz4_decompress(const char *source, int32 slen, char *dest,
+ int32 rawsize, bool check_complete)
+{
+ int ret;
+
+ if (check_complete)
+ {
+ ret = LZ4_decompress_safe(source, dest, slen, rawsize);
+
+ /*
+ * Check we decompressed the right amount.
+ */
+ if (ret != rawsize)
+ return -1;
+ }
+ else
+ ret = LZ4_decompress_safe_partial(source, dest, slen, rawsize,
+ rawsize);
+
+ return ret;
+}
+#endif
+
+/*
+ * Compress using the configured algorithm.
+ */
+int32
+pg_compress(const char *source, int32 slen, char *dest, int32 capacity,
+ const PGLZ_Strategy *strategy)
+{
+ int32 ret;
+
+ switch (compression_algorithm)
+ {
+ case COMPRESS_ALGO_PGLZ:
+ dest[0] = COMPRESS_ALGO_PGLZ;
+ ret = pglz_compress(source, slen, &dest[1], capacity - 1,
+ strategy);
+ break;
+#ifdef HAVE_LZ4
+ case COMPRESS_ALGO_LZ4:
+ dest[0] = COMPRESS_ALGO_LZ4;
+ ret = lz4_compress(source, slen, &dest[1], capacity - 1,
+ strategy);
+ break;
+#endif
+ default:
+ pg_unreachable();
+ }
+
+ if (ret >= 0)
+ return ret + 1;
+
+ return ret;
+}
+
+/*
+ * Decompress data compressed with one of the supported algorithms.
+ */
+int32
+pg_decompress(const char *source, int32 slen, char *dest,
+ int32 rawsize, bool check_complete)
+{
+ switch (source[0])
+ {
+ case COMPRESS_ALGO_PGLZ:
+ return pglz_decompress(&source[1], slen - 1, dest, rawsize,
+ check_complete);
+#ifdef HAVE_LZ4
+ case COMPRESS_ALGO_LZ4:
+ return lz4_decompress(&source[1], slen - 1, dest, rawsize,
+ check_complete);
+#endif
+ default:
+ Assert(false); /* XXX: Can't elog here. */
+ }
+
+ pg_unreachable();
+}
+
+/*
+ * Compute the buffer size required by pg_compress for the configured
+ * algorithm, including our header size.
+ */
+int32
+pg_compress_bound(int32 slen)
+{
+ switch (compression_algorithm)
+ {
+ case COMPRESS_ALGO_PGLZ:
+ /*
+ * For pglz we allow 4 bytes for overrun before detecting
+ * compression failure.
+ */
+ return slen + 4 + SIZEOF_PG_COMPRESS_HEADER;
+#ifdef HAVE_LZ4
+ case COMPRESS_ALGO_LZ4:
+ /* LZ4 provides direct interface for calculating needed space. */
+ return LZ4_compressBound(slen) + SIZEOF_PG_COMPRESS_HEADER;
+#endif
+ default:
+ pg_unreachable();
+ }
+
+ pg_unreachable();
+}
diff --git a/src/include/common/pg_lzcompress.h b/src/include/common/pg_lzcompress.h
index 555576436c..04453f4574 100644
--- a/src/include/common/pg_lzcompress.h
+++ b/src/include/common/pg_lzcompress.h
@@ -10,15 +10,35 @@
#ifndef _PG_LZCOMPRESS_H_
#define _PG_LZCOMPRESS_H_
+#ifdef HAVE_LZ4
+#include "lz4.h"
-/* ----------
- * PGLZ_MAX_OUTPUT -
+#define SIZEOF_PG_COMPRESS_HEADER 1
+/*
+ * Macro version of pg_compress_bound; less precise, but usable in places
+ * where we need compile-time size information.
+ * We add +1 compared to what the algorithms need because that's the size
+ * of the pg_compress header.
+ */
+#define PGLZ_MAX_OUTPUT(_dlen) (Max((_dlen) + 4, LZ4_COMPRESSBOUND(_dlen)) + \
+ SIZEOF_PG_COMPRESS_HEADER)
+#else
+#define PGLZ_MAX_OUTPUT(_dlen) ((_dlen) + 4 + SIZEOF_PG_COMPRESS_HEADER)
+#endif
+
+/*
+ * PGLZCompressionAlgo
*
- * Macro to compute the buffer size required by pglz_compress().
- * We allow 4 bytes for overrun before detecting compression failure.
- * ----------
+ * Which algorithm to use for TOAST and WAL compression.
+ *
+ * COMPRESS_ALGO_PGLZ - use the builtin pglz algorithm
+ * COMPRESS_ALGO_LZ4 - use the LZ4 library
*/
-#define PGLZ_MAX_OUTPUT(_dlen) ((_dlen) + 4)
+typedef enum
+{
+ COMPRESS_ALGO_PGLZ = 0,
+ COMPRESS_ALGO_LZ4
+} PGLZCompressAlgo;
/* ----------
@@ -78,14 +98,25 @@ typedef struct PGLZ_Strategy
extern const PGLZ_Strategy *const PGLZ_strategy_default;
extern const PGLZ_Strategy *const PGLZ_strategy_always;
+/*
+ * Compression algorithm.
+ */
+
+extern int compression_algorithm;
/* ----------
* Global function declarations
* ----------
*/
-extern int32 pglz_compress(const char *source, int32 slen, char *dest,
- const PGLZ_Strategy *strategy);
extern int32 pglz_decompress(const char *source, int32 slen, char *dest,
- int32 rawsize, bool check_complete);
+ int32 rawsize, bool check_complete);
+extern int32 lz4_decompress(const char *source, int32 slen, char *dest,
+ int32 rawsize, bool check_complete);
+extern int32 pg_compress(const char *source, int32 slen, char *dest, int32 capacity,
+ const PGLZ_Strategy *strategy);
+extern int32 pg_decompress(const char *source, int32 slen, char *dest,
+ int32 rawsize, bool check_complete);
+
+extern int32 pg_compress_bound(int32 slen);
#endif /* _PG_LZCOMPRESS_H_ */
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 512213aa32..5e37aab4c5 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -718,6 +718,9 @@
/* Define to 1 if you have the <uuid/uuid.h> header file. */
#undef HAVE_UUID_UUID_H
+/* Define to 1 to build with LZ4 support. */
+#undef HAVE_LZ4
+
/* Define to 1 if you have the <wchar.h> header file. */
#undef HAVE_WCHAR_H
diff --git a/src/include/postgres.h b/src/include/postgres.h
index 057a3413ac..9578588880 100644
--- a/src/include/postgres.h
+++ b/src/include/postgres.h
@@ -145,7 +145,8 @@ typedef union
struct /* Compressed-in-line format */
{
uint32 va_header;
- uint32 va_rawsize; /* Original data size (excludes header) */
+ uint32 va_rawsize:31; /* Original data size (excludes header) */
+ uint32 va_cformat:1;
char va_data[FLEXIBLE_ARRAY_MEMBER]; /* Compressed data */
} va_compressed;
} varattrib_4b;
diff --git a/src/test/Makefile b/src/test/Makefile
index efb206aa75..c6287e7a04 100644
--- a/src/test/Makefile
+++ b/src/test/Makefile
@@ -32,12 +32,18 @@ ifneq (,$(filter ssl,$(PG_TEST_EXTRA)))
SUBDIRS += ssl
endif
endif
+ifeq ($(with_lz4),yes)
+ifneq (,$(filter toast,$(PG_TEST_EXTRA)))
+SUBDIRS += toast
+endif
+endif
# We don't build or execute these by default, but we do want "make
# clean" etc to recurse into them. (We must filter out those that we
# have conditionally included into SUBDIRS above, else there will be
# make confusion.)
-ALWAYS_SUBDIRS = $(filter-out $(SUBDIRS),examples kerberos ldap locale thread ssl)
+ALWAYS_SUBDIRS = $(filter-out $(SUBDIRS),examples kerberos ldap locale thread \
+ ssl toast)
# We want to recurse to all subdirs for all standard targets, except that
# installcheck and install should not recurse into the subdirectory "modules".
diff --git a/src/test/toast/.gitignore b/src/test/toast/.gitignore
new file mode 100644
index 0000000000..871e943d50
--- /dev/null
+++ b/src/test/toast/.gitignore
@@ -0,0 +1,2 @@
+# Generated by test suite
+/tmp_check/
diff --git a/src/test/toast/Makefile b/src/test/toast/Makefile
new file mode 100644
index 0000000000..5a1ff09a13
--- /dev/null
+++ b/src/test/toast/Makefile
@@ -0,0 +1,25 @@
+#-------------------------------------------------------------------------
+#
+# Makefile for src/test/toast
+#
+# Portions Copyright (c) 1996-2019, PostgreSQL Global Development Group
+# Portions Copyright (c) 1994, Regents of the University of California
+#
+# src/test/toast/Makefile
+#
+#-------------------------------------------------------------------------
+
+subdir = src/test/toast
+top_builddir = ../../..
+include $(top_builddir)/src/Makefile.global
+
+export with_lz4
+
+check:
+ $(prove_check)
+
+installcheck:
+ $(prove_installcheck)
+
+clean distclean maintainer-clean:
+ rm -rf tmp_check
diff --git a/src/test/toast/README b/src/test/toast/README
new file mode 100644
index 0000000000..8802ecbe06
--- /dev/null
+++ b/src/test/toast/README
@@ -0,0 +1,25 @@
+src/test/toast/README
+
+Regression tests for TOAST compression
+======================================
+
+This directory contains a test suite for TOAST compression.
+
+Running the tests
+=================
+
+NOTE: You must have given the --enable-tap-tests argument to configure.
+Also, to use "make installcheck", you must have built and installed
+the core code.
+
+Run
+ make check
+or
+ make installcheck
+You can use "make installcheck" if you previously did "make install".
+In that case, the code in the installation tree is tested. With
+"make check", a temporary installation tree is built from the current
+sources and then tested.
+
+Either way, this test initializes, starts, and stops a test Postgres
+cluster.
diff --git a/src/test/toast/t/001_lz4.pl b/src/test/toast/t/001_lz4.pl
new file mode 100644
index 0000000000..89f7cd177f
--- /dev/null
+++ b/src/test/toast/t/001_lz4.pl
@@ -0,0 +1,124 @@
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More;
+
+use File::Copy;
+
+use FindBin;
+use lib $FindBin::RealBin;
+
+if ($ENV{with_lz4} eq 'yes')
+{
+ plan tests => 10;
+}
+else
+{
+ plan skip_all => 'LZ4 not supported by this build';
+}
+
+#### Set up the server.
+note "setting up data directory";
+my $node = get_new_node('master');
+$node->init;
+$node->append_conf('postgresql.conf', qq[
+compression_algorithm = lz4
+wal_compression = on
+]);
+$node->start;
+
+# Verify that the lz4 setting was picked up.
+my $result = $node->safe_psql('postgres', "SHOW compression_algorithm");
+is($result, 'lz4', 'compression_algorithm set to lz4');
+
+$node->safe_psql('postgres',
+ qq[
+ CREATE TABLE toast_test (
+ id int,
+ data text
+ )]);
+
+$node->safe_psql('postgres',
+ 'ALTER TABLE toast_test ALTER COLUMN data SET STORAGE MAIN');
+
+# Easily compressible data; with STORAGE MAIN it stays compressed inline.
+$node->safe_psql('postgres',
+qq[
+ INSERT INTO toast_test
+ SELECT n, repeat('toasted', 1000)
+ FROM generate_series(1, 100) s(n);
+]);
+
+
+my $toast_size = $node->safe_psql('postgres',
+qq[
+ SELECT pg_relation_size((SELECT reltoastrelid FROM pg_catalog.pg_class WHERE relname = 'toast_test'));
+]);
+
+ok($toast_size == 0, 'easily compressible data is stored inline, not in TOAST');
+
+$node->safe_psql('postgres',
+ 'ALTER TABLE toast_test ALTER COLUMN data SET STORAGE EXTENDED');
+
+# Something less easily compressible, so that it ends up in the TOAST table
+$node->safe_psql('postgres',
+qq[
+ INSERT INTO toast_test
+ SELECT n, (SELECT string_agg(md5(t::text),'')
+ FROM generate_series(1, 200) q(t))
+ FROM generate_series(101, 200) s(n);
+]);
+
+$toast_size = $node->safe_psql('postgres',
+qq[
+ SELECT pg_relation_size((SELECT reltoastrelid FROM pg_catalog.pg_class WHERE relname = 'toast_test'));
+]);
+
+ok($toast_size > 0, 'toast table is used');
+
+# check if we can select data
+is($node->safe_psql('postgres',
+ qq[SELECT id, length(data) FROM toast_test WHERE id = 1]),
+ '1|7000', 'can select compressed data');
+is($node->safe_psql('postgres',
+ qq[SELECT id, length(data) FROM toast_test WHERE id = 200]),
+ '200|6400', 'can select TOAST compressed data');
+
+# test slicing
+is($node->safe_psql('postgres',
+ qq[SELECT id, substr(data, 1, 10) FROM toast_test WHERE id = 50]),
+ '50|toastedtoa', 'slicing of compressed data works');
+
+is($node->safe_psql('postgres',
+ qq[SELECT id, substr(data, 1, 10) FROM toast_test WHERE id = 150]),
+ '150|c4ca4238a0', 'slicing of TOAST works');
+
+$node->append_conf('postgresql.conf', qq[
+compression_algorithm = pglz
+]);
+$node->reload;
+
+# Verify that the changed setting was applied after reload.
+$result = $node->safe_psql('postgres', "SHOW compression_algorithm");
+is($result, 'pglz', 'compression_algorithm set to pglz');
+
+$node->safe_psql('postgres',
+qq[
+ INSERT INTO toast_test
+ SELECT n, (SELECT string_agg(md5(t::text),'')
+ FROM generate_series(1, 200) q(t))
+ FROM generate_series(201, 300) s(n);
+]);
+
+is($node->safe_psql('postgres',
+ qq[SELECT id, length(data) FROM toast_test WHERE id IN (200, 201)]),
+q[200|6400
+201|6400], 'can select TOAST with different compression for different rows');
+
+is($node->safe_psql('postgres',
+ qq[SELECT id, substr(data, 1, 10) FROM toast_test WHERE id IN (150, 250)]),
+q[150|c4ca4238a0
+250|c4ca4238a0], 'slicing of TOAST works with different compression for different rows');
+
+$node->stop;
--
2.20.1