This is an automated email from the ASF dual-hosted git repository.

maxyang pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/cloudberry.git

commit 81d4dd86561583159a9ac709c682e00c468ecd97
Author: Marbin Tan <[email protected]>
AuthorDate: Tue May 9 14:22:28 2023 -0700

    Enable `wal_compression` by default

    Based on recent findings, `wal_compression` can give large benefits
    when workloads generate large amounts of FPIs (Full Page Images).
    With `wal_compression`, the WAL generated can be reduced by 20-30%,
    and by even more when loading high volumes of data causes checkpoints
    to occur too frequently; overly frequent checkpoints cause more FPIs
    to be written to WAL.

    When the amount of WAL generated is reduced, less data is transferred
    via the interconnect to the mirrors. Furthermore, disk I/O is reduced
    because less data is generated. In a multi-node mirrored environment,
    the overall load duration is reduced by 6-15%.

    `wal_compression` does not directly affect AO/CO tables; however,
    because AO/CO tables rely on auxiliary heap tables, there is a small
    performance gain. This can be seen in a 1TB TPC-DS AO/CO load, which
    speeds up by 4%.

    Note that `full_page_writes` must be on to see any of the benefits
    mentioned above.

    ZSTD allocates its own memory. Although this is usually a concern, it
    is not in our case, since a single backend compresses only one 32K
    page at a time. There may be multiple backends on a single host, but
    this will not significantly increase memory usage.

    For further details of the discussion, see the gpdb-dev mailing list:
    https://groups.google.com/a/greenplum.org/g/gpdb-dev/c/0PyQvNB2aNI

    -----------
    Test update
    -----------
    Due to enabling `wal_compression`, the throttle test no longer
    triggers correctly; increase the number of rows to trigger the
    expected behavior again.

    Reviewed-by: Soumyadeep Chakraborty <[email protected]>
    Reviewed-by: Ashwin Agrawal <[email protected]>
---
 src/backend/utils/misc/guc.c                               |  2 +-
 src/backend/utils/misc/postgresql.conf.sample              |  2 +-
 src/test/isolation2/expected/segwalrep/select_throttle.out | 10 +++++-----
 src/test/isolation2/sql/segwalrep/select_throttle.sql      |  2 +-
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index d661ecffc8..521489c3b7 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -1387,7 +1387,7 @@ static struct config_bool ConfigureNamesBool[] =
 			NULL
 		},
 		&wal_compression,
-		false,
+		true,
 		NULL, NULL, NULL
 	},
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index c44dc92899..0908c41223 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -223,7 +223,7 @@ max_prepared_transactions = 250		# can be 0 or more
 #full_page_writes = on			# recover from partial page writes
 #wal_log_hints = off			# also do full page writes of non-critical updates
 					# (change requires restart)
-#wal_compression = off			# enable compression of full-page writes
+#wal_compression = on			# enable compression of full-page writes
 #wal_init_zero = on			# zero-fill new WAL files
 #wal_recycle = on			# recycle WAL files
 #wal_buffers = -1			# min 32kB, -1 sets based on shared_buffers
diff --git a/src/test/isolation2/expected/segwalrep/select_throttle.out b/src/test/isolation2/expected/segwalrep/select_throttle.out
index 1eea7e35d1..1d72208593 100644
--- a/src/test/isolation2/expected/segwalrep/select_throttle.out
+++ b/src/test/isolation2/expected/segwalrep/select_throttle.out
@@ -26,8 +26,8 @@ INSERT INTO select_no_throttle SELECT generate_series (1, 10);
 INSERT 10
 CREATE TABLE select_throttle(a int) DISTRIBUTED BY (a);
 CREATE
-INSERT INTO select_throttle SELECT generate_series (1, 100000);
-INSERT 100000
+INSERT INTO select_throttle SELECT generate_series (1, 900000);
+INSERT 900000
 
 -- Enable tuple hints so that buffer will be marked dirty upon a hint bit change
 -- (so that we don't have to wait for the tuple to age. See logic in markDirty)
@@ -82,9 +82,9 @@ SELECT gp_inject_fault_infinite('wal_sender_loop', 'reset', dbid) FROM gp_segmen
 -- after this, system continue to proceed
 
 1U<:  <... completed>
- count
--------
- 33327
+ count
+--------
+ 299393
 (1 row)
 
 SELECT wait_until_all_segments_synchronized();
diff --git a/src/test/isolation2/sql/segwalrep/select_throttle.sql b/src/test/isolation2/sql/segwalrep/select_throttle.sql
index 5094e2b1ed..a0223ed868 100644
--- a/src/test/isolation2/sql/segwalrep/select_throttle.sql
+++ b/src/test/isolation2/sql/segwalrep/select_throttle.sql
@@ -18,7 +18,7 @@ SELECT pg_reload_conf();
 CREATE TABLE select_no_throttle(a int) DISTRIBUTED BY (a);
 INSERT INTO select_no_throttle SELECT generate_series (1, 10);
 CREATE TABLE select_throttle(a int) DISTRIBUTED BY (a);
-INSERT INTO select_throttle SELECT generate_series (1, 900000);
+INSERT INTO select_throttle SELECT generate_series (1, 900000);
 
 -- Enable tuple hints so that buffer will be marked dirty upon a hint bit change
 -- (so that we don't have to wait for the tuple to age. See logic in markDirty)


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
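
The commit message states that `full_page_writes` must be on for `wal_compression` to have any effect. As a minimal sketch that is not part of the commit, and assuming a psql session connected to the coordinator, the effective values and the compiled-in default could be checked like this:

```sql
-- Illustrative check only; not part of the commit above.
-- Effective values in the current session.
SHOW wal_compression;
SHOW full_page_writes;

-- boot_val is the compiled-in default, which this commit changes
-- from off to on for wal_compression.
SELECT name, setting, boot_val
  FROM pg_settings
 WHERE name IN ('wal_compression', 'full_page_writes');
```

If `setting` differs from `boot_val`, the value is presumably being overridden somewhere, for example in `postgresql.conf`.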
