On Mon, Dec 23, 2024 at 12:50 PM Melanie Plageman <melanieplage...@gmail.com> wrote: > > The other "worst case" is just that you always scan and fail to freeze > an extra 3% of the relation while vacuuming the table. This one is > much easier to achieve. As such, it seems worthwhile to add a GUC and > table option to tune the EAGER_SCAN_MAX_FAILS_PER_REGION such that you > can disable eager scanning altogether (or increase or decrease how > aggressive it is).
Attached v5 adds the GUC and table storage option controlling the maximum number of eager scan failures tolerated for each region of the table. 0001 is a version of a docs patch proposed in [1]. It adds a new Vacuuming subsection of the Server Configuration docs where the proposed GUC in this patch is added in 0003. While adding the table storage option and GUC, I struggled a bit with where in the code to actually determine the final value of vacuum_eager_scan_max_fails. For table storage options, those related to vacuum but not autovacuum are in the main StdRdOptions struct. Of those, some are overridden by VACUUM command parameters which are parsed out into the VacuumParams struct. Though the members of VacuumParams are initialized in ExecVacuum(), the storage parameter overrides are determined in vacuum_rel() and the final value goes in the VacuumParams struct which is passed all the way through to heap_vacuum_rel(). Because VacuumParams is what ultimately gets passed down to the table-AM specific vacuum implementation, autovacuum also initializes its own instance of VacuumParams in the autovac_table struct in table_recheck_autovac() (even though no VACUUM command parameters can affect autovacuum). These are overridden in vacuum_rel() as well. Ultimately vacuum_eager_scan_max_fails is a bit different from the existing members of VacuumParams and StdRdOptions. It is a GUC and a table storage option but not a SQL command parameter -- and both the GUC and the table storage parameter affect both vacuum and autovacuum. And it doesn't need to be initialized in different ways for autovacuum and vacuum. In the end, I decided to follow the existing conventions as closely as I could. - Melanie [1] https://www.postgresql.org/message-id/CAAKRu_aQUOaMYrcjNuXeSkJtaX9oRUzKP57bsYbC0gVVWS%2BcbA%40mail.gmail.com
From d8f497508f2ef0eef3a6ef20b543bd2e0ad14551 Mon Sep 17 00:00:00 2001 From: Melanie Plageman <melanieplage...@gmail.com> Date: Tue, 7 Jan 2025 09:48:34 -0500 Subject: [PATCH v5 3/3] Eagerly scan all-visible pages to amortize aggressive vacuum Introduce eager scanning normal vacuums, in which vacuum scans some of the all-visible but not all-frozen pages in the relation to amortize the cost of an aggressive vacuum. Because the goal is to freeze these all-visible pages, all-visible pages that are eagerly scanned and set all-frozen in the visibility map are considered successful eager scans and those not frozen are considered failed eager scans. If too many eager scans fail in a row, eager scanning is temporarily suspended until a later portion of the relation. The number of failures tolerated is configurable globally and per table. To effectively amortize aggressive vacuums, we cap the number of successes as well. Once we reach the maximum number of blocks successfully eager scanned and frozen, eager scanning is permanently disabled for the current vacuum. Original design idea from Robert Haas, with enhancements from Andres Freund, Tomas Vondra, and me Author: Melanie Plageman Reviewed-by: Andres Freund, Robert Haas, Robert Treat, Bilal Yavuz Discussion: https://postgr.es/m/flat/CAAKRu_ZF_KCzZuOrPrOqjGVe8iRVWEAJSpzMgRQs%3D5-v84cXUg%40mail.gmail.com --- doc/src/sgml/config.sgml | 20 + doc/src/sgml/ref/create_table.sgml | 15 + src/backend/access/common/reloptions.c | 13 +- src/backend/access/heap/vacuumlazy.c | 382 ++++++++++++++++-- src/backend/commands/vacuum.c | 13 + src/backend/postmaster/autovacuum.c | 2 + src/backend/utils/misc/guc_tables.c | 10 + src/backend/utils/misc/postgresql.conf.sample | 1 + src/include/commands/vacuum.h | 23 ++ src/include/utils/rel.h | 7 + 10 files changed, 458 insertions(+), 28 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 3b2430eff55..bd5a63c8661 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -9100,6 +9100,26 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; </listitem> </varlistentry> + <varlistentry id="guc-vacuum-eager-scan-max-fails" xreflabel="vacuum_eager_scan_max_fails"> + <term><varname>vacuum_eager_scan_max_fails</varname> (<type>integer</type>) + <indexterm> + <primary><varname>vacuum_eager_scan_max_fails</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Specifies the maximum number of all-visible pages that + <command>VACUUM</command> may scan and fail to set all-frozen in the + visibility map before disabling eager scanning until the next region + (currently 4096 blocks) of the relation. A value of 0 disables eager + scanning altogether. The default is 128. This parameter can be set in + postgresql.conf or on the server command line but is overridden for + individual tables by changing the corresponding table storage + parameter. + </para> + </listitem> + </varlistentry> + </variablelist> </sect2> </sect1> diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml index 70fa929caa4..ab1516b5e9d 100644 --- a/doc/src/sgml/ref/create_table.sgml +++ b/doc/src/sgml/ref/create_table.sgml @@ -1901,6 +1901,21 @@ WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REM </listitem> </varlistentry> + <varlistentry id="reloption-vacuum-eager-scan-max-fails" xreflabel="vacuum_eager_scan_max_fails"> + <term><literal>vacuum_eager_scan_max_fails</literal>, <literal>toast.vacuum_eager_scan_max_fails</literal> (<type>integer</type>) + <indexterm> + <primary><varname>vacuum_eager_scan_max_fails</varname></primary> + <secondary>storage parameter</secondary> + </indexterm> + </term> + <listitem> + <para> + Per-table value for <xref linkend="guc-vacuum-eager-scan-max-fails"/> + parameter. + </para> + </listitem> + </varlistentry> + <varlistentry id="reloption-user-catalog-table" xreflabel="user_catalog_table"> <term><literal>user_catalog_table</literal> (<type>boolean</type>) <indexterm> diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c index e587abd9990..daff9f1fa8d 100644 --- a/src/backend/access/common/reloptions.c +++ b/src/backend/access/common/reloptions.c @@ -27,6 +27,7 @@ #include "catalog/pg_type.h" #include "commands/defrem.h" #include "commands/tablespace.h" +#include "commands/vacuum.h" #include "nodes/makefuncs.h" #include "utils/array.h" #include "utils/attoptcache.h" @@ -319,6 +320,14 @@ static relopt_int intRelOpts[] = }, -1, -1, INT_MAX }, + { + { + "vacuum_eager_scan_max_fails", + "Maximum number of all-visible pages that vacuum will eagerly scan and fail to freeze before giving up on eager scanning until the next region", + RELOPT_KIND_HEAP | RELOPT_KIND_TOAST, + ShareUpdateExclusiveLock + }, -1, 0, VACUUM_EAGER_SCAN_REGION_SIZE + }, { { "toast_tuple_target", @@ -1880,7 +1889,9 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind) {"vacuum_index_cleanup", RELOPT_TYPE_ENUM, offsetof(StdRdOptions, vacuum_index_cleanup)}, {"vacuum_truncate", RELOPT_TYPE_BOOL, - offsetof(StdRdOptions, vacuum_truncate)} + offsetof(StdRdOptions, vacuum_truncate)}, + {"vacuum_eager_scan_max_fails", RELOPT_TYPE_INT, + offsetof(StdRdOptions, vacuum_eager_scan_max_fails)} }; return (bytea *) build_reloptions(reloptions, validate, kind, diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index c047fb20f7a..6c04b963c51 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -30,10 +30,47 @@ * to the end, skipping pages as permitted by their visibility status, vacuum * options, and the eagerness level of the vacuum. * - * When page skipping is enabled, non-aggressive vacuums may skip scanning - * pages that are marked all-visible in the visibility map. We may choose not - * to skip pages if the range of skippable pages is below - * SKIP_PAGES_THRESHOLD. + * Vacuums are either aggressive or normal. Aggressive vacuums must scan every + * unfrozen tuple in order to advance relfrozenxid and avoid transaction ID + * wraparound. Normal vacuums may scan otherwise skippable pages for one of + * two reasons: + * + * When page skipping is not disabled, a normal vacuum may scan pages that are + * marked all-visible (and even all-frozen) in the visibility map if the range + * of skippable pages is below SKIP_PAGES_THRESHOLD. This is primarily for the + * benefit of kernel readahead (see comment in heap_vac_scan_next_block()). + * + * A normal vacuum may also scan skippable pages in an effort to freeze them + * and decrease the backlog of all-visible but not all-frozen pages that have + * to be processed by the next aggressive vacuum. These are referred to as + * eagerly scanned pages. Pages scanned due to SKIP_PAGES_THRESHOLD do not + * count as eagerly scanned pages. + * + * Normal vacuums count all-visible pages eagerly scanned as a success when + * they are able to set them all-frozen in the VM and as a failure when they + * are not able to set them all-frozen. + * + * Because we want to amortize the overhead of freezing pages over multiple + * vacuums, normal vacuums cap the number of successful eager scans to + * EAGER_SCAN_SUCCESS_RATE of the number of all-visible but not all-frozen + * pages at the beginning of the vacuum. Once the success cap has been hit, + * eager scanning is permanently disabled. + * + * Success is a global cap because we don't want to limit our successes if old + * data happens to be concentrated in a particular part of the table. This is + * especially likely to happen for append-mostly workloads where the oldest + * data is at the beginning of the unfrozen portion of the relation. + * + * On the assumption that different regions of the table are likely to contain + * similarly aged data, normal vacuums use a localized eager scan failure cap + * instead of a global cap for the whole relation. The failure count is reset + * for each region of the table -- comprised of VACUUM_EAGER_SCAN_REGION_SIZE + * blocks. In each region, we tolerate vacuum_eager_scan_max_fails before + * suspending eager scanning until the end of the region. + * vacuum_eager_scan_max_fails is configurable both globally and per table. + * + * Aggressive vacuums must examine every unfrozen tuple and thus are not + * subject to any of the limits imposed by the eager scanning algorithm. * * Once vacuum has decided to scan a given block, it must read in the block * and obtain a cleanup lock to prune tuples on the page. A non-aggressive @@ -88,6 +125,7 @@ #include "commands/progress.h" #include "commands/vacuum.h" #include "common/int.h" +#include "common/pg_prng.h" #include "executor/instrument.h" #include "miscadmin.h" #include "pgstat.h" @@ -173,6 +211,15 @@ typedef enum VACUUM_ERRCB_PHASE_TRUNCATE, } VacErrPhase; +/* + * An eager scan of a page that is set all-frozen in the VM is considered + * "successful". To spread out eager scanning across multiple normal vacuums, + * we limit the number of successful eager page scans. The maximum number of + * successful eager page scans is calculated as a ratio of the all-visible but + * not all-frozen pages at the beginning of the vacuum. + */ +#define EAGER_SCAN_SUCCESS_RATE 0.2 + typedef struct LVRelState { /* Target heap relation and its indexes */ @@ -229,6 +276,13 @@ typedef struct LVRelState BlockNumber rel_pages; /* total number of pages */ BlockNumber scanned_pages; /* # pages examined (not skipped via VM) */ + + /* + * Count of all-visible blocks eagerly scanned (for logging only). This + * does not include skippable blocks scanned due to SKIP_PAGES_THRESHOLD. + */ + BlockNumber eager_scanned_pages; + BlockNumber removed_pages; /* # pages removed by relation truncation */ BlockNumber new_frozen_tuple_pages; /* # pages with newly frozen tuples */ @@ -270,9 +324,55 @@ typedef struct LVRelState BlockNumber current_block; /* last block returned */ BlockNumber next_unskippable_block; /* next unskippable block */ bool next_unskippable_allvis; /* its visibility status */ + bool next_unskippable_eager_scanned; /* if it was eager scanned */ Buffer next_unskippable_vmbuffer; /* buffer containing its VM bit */ + + /* State related to managing eager scanning of all-visible pages */ + + /* + * A normal vacuum that has failed to freeze too many eagerly scanned + * blocks in a row suspends eager scanning. next_eager_scan_region_start + * is the block number of the first block eligible for resumed eager + * scanning. + * + * When eager scanning is permanently disabled, either initially + * (including for aggressive vacuum) or due to hitting the success limit, + * this is set to InvalidBlockNumber. + */ + BlockNumber next_eager_scan_region_start; + + /* + * The remaining number of blocks a normal vacuum will consider eager + * scanning. When eager scanning is enabled, this is initialized to + * EAGER_SCAN_SUCCESS_RATE of the total number of all-visible but not + * all-frozen pages. For each eager scan success, this is decremented. + * Once it hits 0, eager scanning is permanently disabled. It is + * initialized to 0 if eager scanning starts out disabled (including for + * aggressive vacuum). + */ + BlockNumber eager_scan_remaining_successes; + + /* + * The number of eagerly scanned blocks vacuum failed to freeze (due to + * age) in the current eager scan region. Vacuum resets it to + * vacuum_eager_scan_max_fails each time it enters a new region of the + * relation. If eager_scan_remaining_fails hits 0, eager scanning is + * suspended until the next region. It is also 0 if eager scanning has + * been permanently disabled. + */ + BlockNumber eager_scan_remaining_fails; + + /* + * The maximum number of blocks which may be eager scanned and not frozen + * before eager scanning is temporarily suspended. This is configurable + * both globally, via the vacuum_eager_scan_max_fails GUC, and per table, + * with a table storage parameter of the same name. It is 0 when eager + * scanning is disabled. + */ + BlockNumber eager_scan_max_fails_per_region; } LVRelState; + /* Struct for saving and restoring vacuum error information. */ typedef struct LVSavedErrInfo { @@ -284,8 +384,10 @@ typedef struct LVSavedErrInfo /* non-export function prototypes */ static void lazy_scan_heap(LVRelState *vacrel); +static void heap_vacuum_eager_scan_setup(LVRelState *vacrel, VacuumParams *params); static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno, - bool *all_visible_according_to_vm); + bool *all_visible_according_to_vm, + bool *was_eager_scanned); static void find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis); static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno, Page page, @@ -293,7 +395,7 @@ static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, static void lazy_scan_prune(LVRelState *vacrel, Buffer buf, BlockNumber blkno, Page page, Buffer vmbuffer, bool all_visible_according_to_vm, - bool *has_lpdead_items); + bool *has_lpdead_items, bool *vm_page_frozen); static bool lazy_scan_noprune(LVRelState *vacrel, Buffer buf, BlockNumber blkno, Page page, bool *has_lpdead_items); @@ -335,6 +437,121 @@ static void restore_vacuum_error_info(LVRelState *vacrel, const LVSavedErrInfo *saved_vacrel); + +/* + * Helper to set up the eager scanning state for vacuuming a single relation. + * Initializes the eager scan management related members of the LVRelState. + * + * Caller provides whether or not an aggressive vacuum is required due to + * vacuum options or for relfrozenxid/relminmxid advancement. + */ +static void +heap_vacuum_eager_scan_setup(LVRelState *vacrel, VacuumParams *params) +{ + uint32 randseed; + BlockNumber allvisible; + BlockNumber allfrozen; + float first_region_ratio; + bool oldest_unfrozen_requires_freeze = false; + + /* + * Initialize eager scan management fields to their disabled values. + * Aggressive vacuums, normal vacuums of small tables, and normal vacuums + * of tables without sufficiently old tuples disable eager scanning. + */ + vacrel->next_eager_scan_region_start = InvalidBlockNumber; + vacrel->eager_scan_max_fails_per_region = 0; + vacrel->eager_scan_remaining_fails = 0; + vacrel->eager_scan_remaining_successes = 0; + + /* If eager scanning is explicitly disabled, just return. */ + if (params->eager_scan_max_fails == 0) + return; + + /* + * The caller will have determined whether or not an aggressive vacuum is + * required by either the vacuum parameters or the relative age of the + * oldest unfrozen transaction IDs. An aggressive vacuum must scan every + * all-visible page to safely advance the relfrozenxid and/or relminmxid, + * so scans of all-visible pages are not considered eager. + */ + if (vacrel->aggressive) + return; + + /* + * If the relation is smaller than a single region, we won't bother eager + * scanning it. A future aggressive vacuum shouldn't take very long, so + * there is no point in amortization. + */ + if (vacrel->rel_pages < VACUUM_EAGER_SCAN_REGION_SIZE) + return; + + Assert(params->eager_scan_max_fails >= 0 && + params->eager_scan_max_fails <= 4096); + + /* + * We only want to enable eager scanning if we are likely to be able to + * freeze some of the pages in the relation. We are only guaranteed to + * freeze a page if some of the tuples _require_ freezing. Tuples require + * freezing if any of their xids precede the freeze limit or multixact + * cutoff (calculated from vacuum_[multixact_]freeze_min_age). So, if the + * oldest unfrozen xid (relfrozenxid/relminmxid) does not precede the + * freeze cutoff, we won't find tuples requiring freezing. + */ + if (TransactionIdIsNormal(vacrel->cutoffs.relfrozenxid) && + TransactionIdPrecedesOrEquals(vacrel->cutoffs.relfrozenxid, + vacrel->cutoffs.FreezeLimit)) + oldest_unfrozen_requires_freeze = true; + + if (!oldest_unfrozen_requires_freeze && + MultiXactIdIsValid(vacrel->cutoffs.relminmxid) && + MultiXactIdPrecedesOrEquals(vacrel->cutoffs.relminmxid, + vacrel->cutoffs.MultiXactCutoff)) + oldest_unfrozen_requires_freeze = true; + + if (!oldest_unfrozen_requires_freeze) + return; + + /* We have met the criteria to eagerly scan some pages. */ + + /* + * Our success cap is EAGER_SCAN_SUCCESS_RATE of the number of all-visible + * but not all-frozen blocks in the relation. + */ + visibilitymap_count(vacrel->rel, &allvisible, &allfrozen); + + vacrel->eager_scan_remaining_successes = + (BlockNumber) (EAGER_SCAN_SUCCESS_RATE * + (allvisible - allfrozen)); + + /* If the table is entirely frozen, eager scanning is disabled. */ + if (vacrel->eager_scan_remaining_successes == 0) + return; + + /* + * Now calculate the eager scan start block. Start at a random spot + * somewhere within the first eager scan region. This avoids eager + * scanning and failing to freeze the exact same blocks each vacuum of the + * relation. + */ + randseed = pg_prng_uint32(&pg_global_prng_state); + + vacrel->next_eager_scan_region_start = randseed % + VACUUM_EAGER_SCAN_REGION_SIZE; + + vacrel->eager_scan_max_fails_per_region = params->eager_scan_max_fails; + + /* + * The first region will be smaller than subsequent regions. As such, + * adjust the eager scan failures tolerated for this region. + */ + first_region_ratio = 1 - (float) vacrel->next_eager_scan_region_start / + VACUUM_EAGER_SCAN_REGION_SIZE; + + vacrel->eager_scan_remaining_fails = vacrel->eager_scan_max_fails_per_region * + first_region_ratio; +} + /* * heap_vacuum_rel() -- perform VACUUM for one heap relation * @@ -463,6 +680,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params, /* Initialize page counters explicitly (be tidy) */ vacrel->scanned_pages = 0; + vacrel->eager_scanned_pages = 0; vacrel->removed_pages = 0; vacrel->new_frozen_tuple_pages = 0; vacrel->lpdead_item_pages = 0; @@ -488,6 +706,7 @@ heap_vacuum_rel(Relation rel, VacuumParams *params, vacrel->vm_new_visible_pages = 0; vacrel->vm_new_visible_frozen_pages = 0; vacrel->vm_new_frozen_pages = 0; + vacrel->rel_pages = orig_rel_pages = RelationGetNumberOfBlocks(rel); /* * Get cutoffs that determine which deleted tuples are considered DEAD, @@ -506,11 +725,16 @@ heap_vacuum_rel(Relation rel, VacuumParams *params, * to increase the number of dead tuples it can prune away.) */ vacrel->aggressive = vacuum_get_cutoffs(rel, params, &vacrel->cutoffs); - vacrel->rel_pages = orig_rel_pages = RelationGetNumberOfBlocks(rel); vacrel->vistest = GlobalVisTestFor(rel); /* Initialize state used to track oldest extant XID/MXID */ vacrel->NewRelfrozenXid = vacrel->cutoffs.OldestXmin; vacrel->NewRelminMxid = vacrel->cutoffs.OldestMxact; + + /* + * Initialize state related to tracking all-visible page skipping. This is + * very important to determine whether or not it is safe to advance the + * relfrozenxid/relminmxid. + */ vacrel->skippedallvis = false; skipwithvm = true; if (params->options & VACOPT_DISABLE_PAGE_SKIPPING) @@ -525,6 +749,13 @@ heap_vacuum_rel(Relation rel, VacuumParams *params, vacrel->skipwithvm = skipwithvm; + /* + * Set up eager scan tracking state. This must happen after determining + * whether or not the vacuum must be aggressive, because only normal + * vacuums use the eager scan algorithm. + */ + heap_vacuum_eager_scan_setup(vacrel, params); + if (verbose) { if (vacrel->aggressive) @@ -719,12 +950,14 @@ heap_vacuum_rel(Relation rel, VacuumParams *params, vacrel->relnamespace, vacrel->relname, vacrel->num_index_scans); - appendStringInfo(&buf, _("pages: %u removed, %u remain, %u scanned (%.2f%% of total)\n"), + appendStringInfo(&buf, _("pages: %u removed, %u remain, %u scanned (%.2f%% of total), %u eager scanned\n"), vacrel->removed_pages, new_rel_pages, vacrel->scanned_pages, orig_rel_pages == 0 ? 100.0 : - 100.0 * vacrel->scanned_pages / orig_rel_pages); + 100.0 * vacrel->scanned_pages / + orig_rel_pages, + vacrel->eager_scanned_pages); appendStringInfo(&buf, _("tuples: %lld removed, %lld remain, %lld are dead but not yet removable\n"), (long long) vacrel->tuples_deleted, @@ -895,8 +1128,10 @@ lazy_scan_heap(LVRelState *vacrel) BlockNumber rel_pages = vacrel->rel_pages, blkno, next_fsm_block_to_vacuum = 0; - bool all_visible_according_to_vm; - + bool all_visible_according_to_vm, + was_eager_scanned = false; + BlockNumber orig_eager_scan_success_limit = + vacrel->eager_scan_remaining_successes; /* for logging */ Buffer vmbuffer = InvalidBuffer; const int initprog_index[] = { PROGRESS_VACUUM_PHASE, @@ -915,13 +1150,16 @@ lazy_scan_heap(LVRelState *vacrel) vacrel->current_block = InvalidBlockNumber; vacrel->next_unskippable_block = InvalidBlockNumber; vacrel->next_unskippable_allvis = false; + vacrel->next_unskippable_eager_scanned = false; vacrel->next_unskippable_vmbuffer = InvalidBuffer; - while (heap_vac_scan_next_block(vacrel, &blkno, &all_visible_according_to_vm)) + while (heap_vac_scan_next_block(vacrel, &blkno, &all_visible_according_to_vm, + &was_eager_scanned)) { Buffer buf; Page page; bool has_lpdead_items; + bool vm_page_frozen = false; bool got_cleanup_lock = false; vacrel->scanned_pages++; @@ -1049,7 +1287,46 @@ lazy_scan_heap(LVRelState *vacrel) if (got_cleanup_lock) lazy_scan_prune(vacrel, buf, blkno, page, vmbuffer, all_visible_according_to_vm, - &has_lpdead_items); + &has_lpdead_items, &vm_page_frozen); + + /* + * Count an eagerly scanned page as a failure or a success. + */ + if (was_eager_scanned) + { + /* Aggressive vacuums do not eager scan. */ + Assert(!vacrel->aggressive); + + if (vm_page_frozen) + { + Assert(vacrel->eager_scan_remaining_successes > 0); + vacrel->eager_scan_remaining_successes--; + + if (vacrel->eager_scan_remaining_successes == 0) + { + /* + * If we hit our success limit, there is no need to + * eagerly scan any additional pages. Permanently disable + * eager scanning by setting the other eager scan + * management fields to their disabled values. + */ + vacrel->eager_scan_remaining_fails = 0; + vacrel->next_eager_scan_region_start = InvalidBlockNumber; + vacrel->eager_scan_max_fails_per_region = 0; + + ereport(INFO, + (errmsg("Vacuum successfully froze %u eager scanned blocks of \"%s.%s.%s\". Now disabling eager scanning.", + orig_eager_scan_success_limit, + vacrel->dbname, vacrel->relnamespace, + vacrel->relname))); + } + } + else + { + Assert(vacrel->eager_scan_remaining_fails > 0); + vacrel->eager_scan_remaining_fails--; + } + } /* * Now drop the buffer lock and, potentially, update the FSM. @@ -1149,7 +1426,9 @@ lazy_scan_heap(LVRelState *vacrel) * * The block number and visibility status of the next block to process are set * in *blkno and *all_visible_according_to_vm. The return value is false if - * there are no further blocks to process. + * there are no further blocks to process. If the block is being eagerly + * scanned, was_eager_scanned is set so that the caller can count whether or + * not an eager scanned page is successfully frozen. * * vacrel is an in/out parameter here. Vacuum options and information about * the relation are read. vacrel->skippedallvis is set if we skip a block @@ -1159,13 +1438,16 @@ lazy_scan_heap(LVRelState *vacrel) */ static bool heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno, - bool *all_visible_according_to_vm) + bool *all_visible_according_to_vm, + bool *was_eager_scanned) { BlockNumber next_block; /* relies on InvalidBlockNumber + 1 overflowing to 0 on first call */ next_block = vacrel->current_block + 1; + *was_eager_scanned = false; + /* Have we reached the end of the relation? */ if (next_block >= vacrel->rel_pages) { @@ -1238,6 +1520,9 @@ heap_vac_scan_next_block(LVRelState *vacrel, BlockNumber *blkno, *blkno = vacrel->current_block = next_block; *all_visible_according_to_vm = vacrel->next_unskippable_allvis; + *was_eager_scanned = vacrel->next_unskippable_eager_scanned; + if (*was_eager_scanned) + vacrel->eager_scanned_pages++; return true; } } @@ -1261,11 +1546,12 @@ find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis) BlockNumber rel_pages = vacrel->rel_pages; BlockNumber next_unskippable_block = vacrel->next_unskippable_block + 1; Buffer next_unskippable_vmbuffer = vacrel->next_unskippable_vmbuffer; + bool next_unskippable_eager_scanned = false; bool next_unskippable_allvis; *skipsallvis = false; - for (;;) + for (;; next_unskippable_block++) { uint8 mapbits = visibilitymap_get_status(vacrel->rel, next_unskippable_block, @@ -1273,6 +1559,19 @@ find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis) next_unskippable_allvis = (mapbits & VISIBILITYMAP_ALL_VISIBLE) != 0; + /* + * At the start of each eager scan region, normal vacuums with eager + * scanning enabled reset the failure counter, allowing vacuum to + * resume eager scanning if it had been suspended in the previous + * region. + */ + if (next_unskippable_block >= vacrel->next_eager_scan_region_start) + { + vacrel->eager_scan_remaining_fails = + vacrel->eager_scan_max_fails_per_region; + vacrel->next_eager_scan_region_start += VACUUM_EAGER_SCAN_REGION_SIZE; + } + /* * A block is unskippable if it is not all visible according to the * visibility map. @@ -1305,24 +1604,34 @@ find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis) * all-visible. They may still skip all-frozen pages, which can't * contain XIDs < OldestXmin (XIDs that aren't already frozen by now). */ - if ((mapbits & VISIBILITYMAP_ALL_FROZEN) == 0) - { - if (vacrel->aggressive) - break; + if (mapbits & VISIBILITYMAP_ALL_FROZEN) + continue; - /* - * All-visible block is safe to skip in non-aggressive case. But - * remember that the final range contains such a block for later. - */ - *skipsallvis = true; + /* + * Aggressive vacuums cannot skip all-visible pages that are not also + * all-frozen. Normal vacuums with eager scanning enabled only skip + * such pages if they have hit the failure limit for the current eager + * scan region. + */ + if (vacrel->aggressive || + vacrel->eager_scan_remaining_fails > 0) + { + if (!vacrel->aggressive) + next_unskippable_eager_scanned = true; + break; } - next_unskippable_block++; + /* + * All-visible blocks are safe to skip in a normal vacuum. But + * remember that the final range contains such a block for later. + */ + *skipsallvis = true; } /* write the local variables back to vacrel */ vacrel->next_unskippable_block = next_unskippable_block; vacrel->next_unskippable_allvis = next_unskippable_allvis; + vacrel->next_unskippable_eager_scanned = next_unskippable_eager_scanned; vacrel->next_unskippable_vmbuffer = next_unskippable_vmbuffer; } @@ -1353,6 +1662,10 @@ find_next_unskippable_block(LVRelState *vacrel, bool *skipsallvis) * lazy_scan_prune (or lazy_scan_noprune). Otherwise returns true, indicating * that lazy_scan_heap is done processing the page, releasing lock on caller's * behalf. + * + * No vm_page_frozen output parameter (like what is passed to + * lazy_scan_prune()) is passed here because empty pages are always frozen and + * thus could never be eager scanned. */ static bool lazy_scan_new_or_empty(LVRelState *vacrel, Buffer buf, BlockNumber blkno, @@ -1492,6 +1805,10 @@ cmpOffsetNumbers(const void *a, const void *b) * * *has_lpdead_items is set to true or false depending on whether, upon return * from this function, any LP_DEAD items are still present on the page. + * + * *vm_page_frozen is set to true if the page is newly set all-frozen in the + * VM. The caller currently only uses this for determining whether an eagerly + * scanned page was successfully set all-frozen. */ static void lazy_scan_prune(LVRelState *vacrel, @@ -1500,7 +1817,8 @@ lazy_scan_prune(LVRelState *vacrel, Page page, Buffer vmbuffer, bool all_visible_according_to_vm, - bool *has_lpdead_items) + bool *has_lpdead_items, + bool *vm_page_frozen) { Relation rel = vacrel->rel; PruneFreezeResult presult; @@ -1652,11 +1970,17 @@ lazy_scan_prune(LVRelState *vacrel, { vacrel->vm_new_visible_pages++; if (presult.all_frozen) + { vacrel->vm_new_visible_frozen_pages++; + *vm_page_frozen = true; + } } else if ((old_vmbits & VISIBILITYMAP_ALL_FROZEN) == 0 && presult.all_frozen) + { vacrel->vm_new_frozen_pages++; + *vm_page_frozen = true; + } } /* @@ -1744,6 +2068,7 @@ lazy_scan_prune(LVRelState *vacrel, { vacrel->vm_new_visible_pages++; vacrel->vm_new_visible_frozen_pages++; + *vm_page_frozen = true; } /* @@ -1751,7 +2076,10 @@ lazy_scan_prune(LVRelState *vacrel, * above, so we don't need to test the value of old_vmbits. */ else + { vacrel->vm_new_frozen_pages++; + *vm_page_frozen = true; + } } } diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c index 2640228bef4..655fd3fd853 100644 --- a/src/backend/commands/vacuum.c +++ b/src/backend/commands/vacuum.c @@ -69,6 +69,7 @@ int vacuum_multixact_freeze_min_age; int vacuum_multixact_freeze_table_age; int vacuum_failsafe_age; int vacuum_multixact_failsafe_age; +int vacuum_eager_scan_max_fails; /* * Variables for cost-based vacuum delay. The defaults differ between @@ -405,6 +406,9 @@ ExecVacuum(ParseState *pstate, VacuumStmt *vacstmt, bool isTopLevel) /* user-invoked vacuum uses VACOPT_VERBOSE instead of log_min_duration */ params.log_min_duration = -1; + /* Later we check if a reloption override was specified */ + params.eager_scan_max_fails = vacuum_eager_scan_max_fails; + /* * Create special memory context for cross-transaction storage. * @@ -2165,6 +2169,15 @@ vacuum_rel(Oid relid, RangeVar *relation, VacuumParams *params, } } + /* + * Check if the vacuum_eager_scan_max_fails table storage parameter was + * specified. This overrides the GUC value. + */ + if (rel->rd_options != NULL && + ((StdRdOptions *) rel->rd_options)->vacuum_eager_scan_max_fails >= 0) + params->eager_scan_max_fails = + ((StdRdOptions *) rel->rd_options)->vacuum_eager_scan_max_fails; + /* * Set truncate option based on truncate reloption if it wasn't specified * in VACUUM command, or when running in an autovacuum worker diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c index 0ab921a169b..1d5ab1c89bc 100644 --- a/src/backend/postmaster/autovacuum.c +++ b/src/backend/postmaster/autovacuum.c @@ -2826,6 +2826,8 @@ table_recheck_autovac(Oid relid, HTAB *table_toast_map, tab->at_params.is_wraparound = wraparound; tab->at_params.log_min_duration = log_min_duration; tab->at_params.toast_parent = InvalidOid; + /* Later we check reloptions for vacuum_eager_scan_max_fails override */ + tab->at_params.eager_scan_max_fails = vacuum_eager_scan_max_fails; tab->at_storage_param_vac_cost_limit = avopts ? avopts->vacuum_cost_limit : 0; tab->at_storage_param_vac_cost_delay = avopts ? diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c index c9d8cd796a8..22e61ab70b1 100644 --- a/src/backend/utils/misc/guc_tables.c +++ b/src/backend/utils/misc/guc_tables.c @@ -2743,6 +2743,16 @@ struct config_int ConfigureNamesInt[] = NULL, NULL, NULL }, + { + {"vacuum_eager_scan_max_fails", PGC_USERSET, CLIENT_CONN_STATEMENT, + gettext_noop("Maximum number of all-visible pages vacuum can eager scan and fail to freeze before suspending eager scanning until the next region of the table"), + NULL + }, + &vacuum_eager_scan_max_fails, + 128, 0, VACUUM_EAGER_SCAN_REGION_SIZE, + NULL, NULL, NULL + }, + { {"vacuum_freeze_table_age", PGC_USERSET, CLIENT_CONN_STATEMENT, gettext_noop("Age at which VACUUM should scan whole table to freeze tuples."), diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample index b2bc43383db..7501d2a317b 100644 --- a/src/backend/utils/misc/postgresql.conf.sample +++ b/src/backend/utils/misc/postgresql.conf.sample @@ -728,6 +728,7 @@ autovacuum_worker_slots = 16 # autovacuum worker slots to allocate #vacuum_multixact_freeze_table_age = 150000000 #vacuum_multixact_freeze_min_age = 5000000 #vacuum_multixact_failsafe_age = 1600000000 +#vacuum_eager_scan_max_fails = 128 # 0 disables eager scanning #bytea_output = 'hex' # hex, escape #xmlbinary = 'base64' #xmloption = 'content' diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h index 12d0b61950d..6b5e5e04818 100644 --- a/src/include/commands/vacuum.h +++ b/src/include/commands/vacuum.h @@ -231,6 +231,14 @@ typedef struct VacuumParams VacOptValue truncate; /* Truncate empty pages at the end */ Oid toast_parent; /* for privilege checks when recursing */ + /* + * The maximum number of all-visible pages that can be scanned and failed + * to be set all-frozen before eager scanning is disabled for the current + * region. Only applicable for table AMs using visibility maps. Derived + * from GUC or table storage parameter. 0 if disabled. + */ + uint32 eager_scan_max_fails; + /* * The number of parallel vacuum workers. 0 by default which means choose * based on the number of indexes. -1 indicates parallel vacuum is @@ -297,6 +305,21 @@ extern PGDLLIMPORT int vacuum_multixact_freeze_table_age; extern PGDLLIMPORT int vacuum_failsafe_age; extern PGDLLIMPORT int vacuum_multixact_failsafe_age; +/* + * Relevant for vacuums implementing eager scanning. Normal vacuums may eagerly + * scan some all-visible but not all-frozen pages. Since our goal is to freeze + * these pages, an eager scan that fails to set the page all-frozen in the VM + * is considered to have "failed". + * + * On the assumption that different regions of the table tend to have similarly + * aged data, once we fail to freeze vacuum_eager_scan_max_fails blocks in a + * region of size VACUUM_EAGER_SCAN_REGION_SIZE, we suspend eager scanning + * until vacuum has progressed to another region of the table with potentially + * older data. + */ +extern PGDLLIMPORT int vacuum_eager_scan_max_fails; +#define VACUUM_EAGER_SCAN_REGION_SIZE 4096 + /* * Maximum value for default_statistics_target and per-column statistics * targets. This is fairly arbitrary, mainly to prevent users from creating diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h index 33d1e4a4e2e..d9fe68f4d86 100644 --- a/src/include/utils/rel.h +++ b/src/include/utils/rel.h @@ -343,6 +343,13 @@ typedef struct StdRdOptions int parallel_workers; /* max number of parallel workers */ StdRdOptIndexCleanup vacuum_index_cleanup; /* controls index vacuuming */ bool vacuum_truncate; /* enables vacuum to truncate a relation */ + + /* + * The maximum number of all-visible pages vacuum may scan and fail to + * freeze before eager scanning is disabled for the current region of the + * table. 0 if disabled, -1 if unspecified. + */ + int vacuum_eager_scan_max_fails; } StdRdOptions; #define HEAP_MIN_FILLFACTOR 10 -- 2.34.1
From 26ecb39b4b065ae466b9edd074a99d9d28fca476 Mon Sep 17 00:00:00 2001 From: Melanie Plageman <melanieplage...@gmail.com> Date: Tue, 7 Jan 2025 11:50:28 -0500 Subject: [PATCH v5 1/3] Consolidate docs for vacuum-related GUCs in new subsection GUCs related to vacuum's freezing behavior were documented in a subsection of the Client Connection Defaults documentation. These GUCs don't belong there, as they affect the freezing behavior of all vacuums -- including autovacuums. There wasn't a clear alternative location, so this commit makes a new Server Configuration docs subsection, "Vacuuming", with a subsection for "Freezing". It also moves the "Automatic Vacuuming" subsection and the docs on GUCs controlling cost-based vacuum delay under the new "Vacuuming" subsection. The other vacuum-related GUCs under the "Resource Consumption" subsection have been left in their current location, as they seem to fit there. Discussion: https://postgr.es/m/flat/1373018.1736213217%40sss.pgh.pa.us#105c713a7966f87e4ac4301246e3cabe --- doc/src/sgml/config.sgml | 1237 +++++++++++++++++++------------------- 1 file changed, 628 insertions(+), 609 deletions(-) diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml index 740ff5d5044..3b2430eff55 100644 --- a/doc/src/sgml/config.sgml +++ b/doc/src/sgml/config.sgml @@ -2367,149 +2367,6 @@ include_dir 'conf.d' </variablelist> </sect2> - <sect2 id="runtime-config-resource-vacuum-cost"> - <title>Cost-based Vacuum Delay</title> - - <para> - During the execution of <xref linkend="sql-vacuum"/> - and <xref linkend="sql-analyze"/> - commands, the system maintains an - internal counter that keeps track of the estimated cost of the - various I/O operations that are performed. When the accumulated - cost reaches a limit (specified by - <varname>vacuum_cost_limit</varname>), the process performing - the operation will sleep for a short period of time, as specified by - <varname>vacuum_cost_delay</varname>. Then it will reset the - counter and continue execution. - </para> - - <para> - The intent of this feature is to allow administrators to reduce - the I/O impact of these commands on concurrent database - activity. There are many situations where it is not - important that maintenance commands like - <command>VACUUM</command> and <command>ANALYZE</command> finish - quickly; however, it is usually very important that these - commands do not significantly interfere with the ability of the - system to perform other database operations. Cost-based vacuum - delay provides a way for administrators to achieve this. - </para> - - <para> - This feature is disabled by default for manually issued - <command>VACUUM</command> commands. To enable it, set the - <varname>vacuum_cost_delay</varname> variable to a nonzero - value. - </para> - - <variablelist> - <varlistentry id="guc-vacuum-cost-delay" xreflabel="vacuum_cost_delay"> - <term><varname>vacuum_cost_delay</varname> (<type>floating point</type>) - <indexterm> - <primary><varname>vacuum_cost_delay</varname> configuration parameter</primary> - </indexterm> - </term> - <listitem> - <para> - The amount of time that the process will sleep - when the cost limit has been exceeded. - If this value is specified without units, it is taken as milliseconds. - The default value is zero, which disables the cost-based vacuum - delay feature. Positive values enable cost-based vacuuming. - </para> - - <para> - When using cost-based vacuuming, appropriate values for - <varname>vacuum_cost_delay</varname> are usually quite small, perhaps - less than 1 millisecond. While <varname>vacuum_cost_delay</varname> - can be set to fractional-millisecond values, such delays may not be - measured accurately on older platforms. On such platforms, - increasing <command>VACUUM</command>'s throttled resource consumption - above what you get at 1ms will require changing the other vacuum cost - parameters. You should, nonetheless, - keep <varname>vacuum_cost_delay</varname> as small as your platform - will consistently measure; large delays are not helpful. - </para> - </listitem> - </varlistentry> - - <varlistentry id="guc-vacuum-cost-page-hit" xreflabel="vacuum_cost_page_hit"> - <term><varname>vacuum_cost_page_hit</varname> (<type>integer</type>) - <indexterm> - <primary><varname>vacuum_cost_page_hit</varname> configuration parameter</primary> - </indexterm> - </term> - <listitem> - <para> - The estimated cost for vacuuming a buffer found in the shared buffer - cache. It represents the cost to lock the buffer pool, lookup - the shared hash table and scan the content of the page. The - default value is one. - </para> - </listitem> - </varlistentry> - - <varlistentry id="guc-vacuum-cost-page-miss" xreflabel="vacuum_cost_page_miss"> - <term><varname>vacuum_cost_page_miss</varname> (<type>integer</type>) - <indexterm> - <primary><varname>vacuum_cost_page_miss</varname> configuration parameter</primary> - </indexterm> - </term> - <listitem> - <para> - The estimated cost for vacuuming a buffer that has to be read from - disk. This represents the effort to lock the buffer pool, - lookup the shared hash table, read the desired block in from - the disk and scan its content. The default value is 2. - </para> - </listitem> - </varlistentry> - - <varlistentry id="guc-vacuum-cost-page-dirty" xreflabel="vacuum_cost_page_dirty"> - <term><varname>vacuum_cost_page_dirty</varname> (<type>integer</type>) - <indexterm> - <primary><varname>vacuum_cost_page_dirty</varname> configuration parameter</primary> - </indexterm> - </term> - <listitem> - <para> - The estimated cost charged when vacuum modifies a block that was - previously clean. It represents the extra I/O required to - flush the dirty block out to disk again. The default value is - 20. - </para> - </listitem> - </varlistentry> - - <varlistentry id="guc-vacuum-cost-limit" xreflabel="vacuum_cost_limit"> - <term><varname>vacuum_cost_limit</varname> (<type>integer</type>) - <indexterm> - <primary><varname>vacuum_cost_limit</varname> configuration parameter</primary> - </indexterm> - </term> - <listitem> - <para> - This is the accumulated cost that will cause the vacuuming process to sleep - for <varname>vacuum_cost_delay</varname>. The default is 200. - </para> - </listitem> - </varlistentry> - </variablelist> - - <note> - <para> - There are certain operations that hold critical locks and should - therefore complete as quickly as possible. Cost-based vacuum - delays do not occur during such operations. Therefore it is - possible that the cost accumulates far higher than the specified - limit. To avoid uselessly long delays in such cases, the actual - delay is calculated as <varname>vacuum_cost_delay</varname> * - <varname>accumulated_balance</varname> / - <varname>vacuum_cost_limit</varname> with a maximum of - <varname>vacuum_cost_delay</varname> * 4. - </para> - </note> - </sect2> <sect2 id="runtime-config-resource-background-writer"> <title>Background Writer</title> @@ -8588,14 +8445,17 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; </sect2> </sect1> - <sect1 id="runtime-config-autovacuum"> - <title>Automatic Vacuuming</title> + <sect1 id="runtime-config-vacuum"> + <title>Vacuuming</title> <indexterm> - <primary>autovacuum</primary> + <primary>vacuum</primary> <secondary>configuration parameters</secondary> </indexterm> + <sect2 id="runtime-config-autovacuum"> + <title>Automatic Vacuuming</title> + <para> These settings control the behavior of the <firstterm>autovacuum</firstterm> feature. Refer to <xref linkend="autovacuum"/> for more information. @@ -8603,324 +8463,645 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; basis; see <xref linkend="sql-createtable-storage-parameters"/>. </para> - <variablelist> + <variablelist> - <varlistentry id="guc-autovacuum" xreflabel="autovacuum"> - <term><varname>autovacuum</varname> (<type>boolean</type>) - <indexterm> - <primary><varname>autovacuum</varname> configuration parameter</primary> - </indexterm> - </term> - <listitem> - <para> - Controls whether the server should run the - autovacuum launcher daemon. This is on by default; however, - <xref linkend="guc-track-counts"/> must also be enabled for - autovacuum to work. - This parameter can only be set in the <filename>postgresql.conf</filename> - file or on the server command line; however, autovacuuming can be - disabled for individual tables by changing table storage parameters. - </para> - <para> - Note that even when this parameter is disabled, the system - will launch autovacuum processes if necessary to - prevent transaction ID wraparound. See <xref - linkend="vacuum-for-wraparound"/> for more information. - </para> - </listitem> - </varlistentry> + <varlistentry id="guc-autovacuum" xreflabel="autovacuum"> + <term><varname>autovacuum</varname> (<type>boolean</type>) + <indexterm> + <primary><varname>autovacuum</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Controls whether the server should run the + autovacuum launcher daemon. This is on by default; however, + <xref linkend="guc-track-counts"/> must also be enabled for + autovacuum to work. + This parameter can only be set in the <filename>postgresql.conf</filename> + file or on the server command line; however, autovacuuming can be + disabled for individual tables by changing table storage parameters. + </para> + <para> + Note that even when this parameter is disabled, the system + will launch autovacuum processes if necessary to + prevent transaction ID wraparound. See <xref + linkend="vacuum-for-wraparound"/> for more information. + </para> + </listitem> + </varlistentry> - <varlistentry id="guc-autovacuum-worker-slots" xreflabel="autovacuum_worker_slots"> - <term><varname>autovacuum_worker_slots</varname> (<type>integer</type>) - <indexterm> - <primary><varname>autovacuum_worker_slots</varname> configuration parameter</primary> - </indexterm> - </term> - <listitem> - <para> - Specifies the number of backend slots to reserve for autovacuum worker - processes. The default is 16. This parameter can only be set at server - start. - </para> - <para> - When changing this value, consider also adjusting - <xref linkend="guc-autovacuum-max-workers"/>. - </para> - </listitem> - </varlistentry> + <varlistentry id="guc-autovacuum-worker-slots" xreflabel="autovacuum_worker_slots"> + <term><varname>autovacuum_worker_slots</varname> (<type>integer</type>) + <indexterm> + <primary><varname>autovacuum_worker_slots</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Specifies the number of backend slots to reserve for autovacuum worker + processes. The default is 16. This parameter can only be set at server + start. + </para> + <para> + When changing this value, consider also adjusting + <xref linkend="guc-autovacuum-max-workers"/>. + </para> + </listitem> + </varlistentry> - <varlistentry id="guc-autovacuum-max-workers" xreflabel="autovacuum_max_workers"> - <term><varname>autovacuum_max_workers</varname> (<type>integer</type>) - <indexterm> - <primary><varname>autovacuum_max_workers</varname> configuration parameter</primary> - </indexterm> - </term> - <listitem> - <para> - Specifies the maximum number of autovacuum processes (other than the - autovacuum launcher) that may be running at any one time. The default - is three. This parameter can only be set in the - <filename>postgresql.conf</filename> file or on the server command line. - </para> - <para> - Note that a setting for this value which is higher than - <xref linkend="guc-autovacuum-worker-slots"/> will have no effect, - since autovacuum workers are taken from the pool of slots established - by that setting. - </para> - </listitem> - </varlistentry> + <varlistentry id="guc-autovacuum-max-workers" xreflabel="autovacuum_max_workers"> + <term><varname>autovacuum_max_workers</varname> (<type>integer</type>) + <indexterm> + <primary><varname>autovacuum_max_workers</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Specifies the maximum number of autovacuum processes (other than the + autovacuum launcher) that may be running at any one time. The default + is three. This parameter can only be set in the + <filename>postgresql.conf</filename> file or on the server command line. + </para> + <para> + Note that a setting for this value which is higher than + <xref linkend="guc-autovacuum-worker-slots"/> will have no effect, + since autovacuum workers are taken from the pool of slots established + by that setting. + </para> + </listitem> + </varlistentry> - <varlistentry id="guc-autovacuum-naptime" xreflabel="autovacuum_naptime"> - <term><varname>autovacuum_naptime</varname> (<type>integer</type>) - <indexterm> - <primary><varname>autovacuum_naptime</varname> configuration parameter</primary> - </indexterm> - </term> - <listitem> - <para> - Specifies the minimum delay between autovacuum runs on any given - database. In each round the daemon examines the - database and issues <command>VACUUM</command> and <command>ANALYZE</command> commands - as needed for tables in that database. - If this value is specified without units, it is taken as seconds. - The default is one minute (<literal>1min</literal>). - This parameter can only be set in the <filename>postgresql.conf</filename> - file or on the server command line. - </para> - </listitem> - </varlistentry> + <varlistentry id="guc-autovacuum-naptime" xreflabel="autovacuum_naptime"> + <term><varname>autovacuum_naptime</varname> (<type>integer</type>) + <indexterm> + <primary><varname>autovacuum_naptime</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Specifies the minimum delay between autovacuum runs on any given + database. In each round the daemon examines the + database and issues <command>VACUUM</command> and <command>ANALYZE</command> commands + as needed for tables in that database. + If this value is specified without units, it is taken as seconds. + The default is one minute (<literal>1min</literal>). + This parameter can only be set in the <filename>postgresql.conf</filename> + file or on the server command line. + </para> + </listitem> + </varlistentry> - <varlistentry id="guc-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold"> - <term><varname>autovacuum_vacuum_threshold</varname> (<type>integer</type>) - <indexterm> - <primary><varname>autovacuum_vacuum_threshold</varname></primary> - <secondary>configuration parameter</secondary> - </indexterm> - </term> - <listitem> - <para> - Specifies the minimum number of updated or deleted tuples needed - to trigger a <command>VACUUM</command> in any one table. - The default is 50 tuples. - This parameter can only be set in the <filename>postgresql.conf</filename> - file or on the server command line; - but the setting can be overridden for individual tables by - changing table storage parameters. - </para> - </listitem> - </varlistentry> + <varlistentry id="guc-autovacuum-vacuum-threshold" xreflabel="autovacuum_vacuum_threshold"> + <term><varname>autovacuum_vacuum_threshold</varname> (<type>integer</type>) + <indexterm> + <primary><varname>autovacuum_vacuum_threshold</varname></primary> + <secondary>configuration parameter</secondary> + </indexterm> + </term> + <listitem> + <para> + Specifies the minimum number of updated or deleted tuples needed + to trigger a <command>VACUUM</command> in any one table. + The default is 50 tuples. + This parameter can only be set in the <filename>postgresql.conf</filename> + file or on the server command line; + but the setting can be overridden for individual tables by + changing table storage parameters. + </para> + </listitem> + </varlistentry> - <varlistentry id="guc-autovacuum-vacuum-insert-threshold" xreflabel="autovacuum_vacuum_insert_threshold"> - <term><varname>autovacuum_vacuum_insert_threshold</varname> (<type>integer</type>) - <indexterm> - <primary><varname>autovacuum_vacuum_insert_threshold</varname></primary> - <secondary>configuration parameter</secondary> - </indexterm> - </term> - <listitem> - <para> - Specifies the number of inserted tuples needed to trigger a - <command>VACUUM</command> in any one table. - The default is 1000 tuples. If -1 is specified, autovacuum will not - trigger a <command>VACUUM</command> operation on any tables based on - the number of inserts. - This parameter can only be set in the <filename>postgresql.conf</filename> - file or on the server command line; - but the setting can be overridden for individual tables by - changing table storage parameters. - </para> - </listitem> - </varlistentry> + <varlistentry id="guc-autovacuum-vacuum-insert-threshold" xreflabel="autovacuum_vacuum_insert_threshold"> + <term><varname>autovacuum_vacuum_insert_threshold</varname> (<type>integer</type>) + <indexterm> + <primary><varname>autovacuum_vacuum_insert_threshold</varname></primary> + <secondary>configuration parameter</secondary> + </indexterm> + </term> + <listitem> + <para> + Specifies the number of inserted tuples needed to trigger a + <command>VACUUM</command> in any one table. + The default is 1000 tuples. If -1 is specified, autovacuum will not + trigger a <command>VACUUM</command> operation on any tables based on + the number of inserts. + This parameter can only be set in the <filename>postgresql.conf</filename> + file or on the server command line; + but the setting can be overridden for individual tables by + changing table storage parameters. + </para> + </listitem> + </varlistentry> - <varlistentry id="guc-autovacuum-analyze-threshold" xreflabel="autovacuum_analyze_threshold"> - <term><varname>autovacuum_analyze_threshold</varname> (<type>integer</type>) - <indexterm> - <primary><varname>autovacuum_analyze_threshold</varname></primary> - <secondary>configuration parameter</secondary> - </indexterm> - </term> - <listitem> - <para> - Specifies the minimum number of inserted, updated or deleted tuples - needed to trigger an <command>ANALYZE</command> in any one table. - The default is 50 tuples. - This parameter can only be set in the <filename>postgresql.conf</filename> - file or on the server command line; - but the setting can be overridden for individual tables by - changing table storage parameters. - </para> - </listitem> - </varlistentry> + <varlistentry id="guc-autovacuum-analyze-threshold" xreflabel="autovacuum_analyze_threshold"> + <term><varname>autovacuum_analyze_threshold</varname> (<type>integer</type>) + <indexterm> + <primary><varname>autovacuum_analyze_threshold</varname></primary> + <secondary>configuration parameter</secondary> + </indexterm> + </term> + <listitem> + <para> + Specifies the minimum number of inserted, updated or deleted tuples + needed to trigger an <command>ANALYZE</command> in any one table. + The default is 50 tuples. + This parameter can only be set in the <filename>postgresql.conf</filename> + file or on the server command line; + but the setting can be overridden for individual tables by + changing table storage parameters. + </para> + </listitem> + </varlistentry> - <varlistentry id="guc-autovacuum-vacuum-scale-factor" xreflabel="autovacuum_vacuum_scale_factor"> - <term><varname>autovacuum_vacuum_scale_factor</varname> (<type>floating point</type>) - <indexterm> - <primary><varname>autovacuum_vacuum_scale_factor</varname></primary> - <secondary>configuration parameter</secondary> - </indexterm> - </term> - <listitem> - <para> - Specifies a fraction of the table size to add to - <varname>autovacuum_vacuum_threshold</varname> - when deciding whether to trigger a <command>VACUUM</command>. - The default is 0.2 (20% of table size). - This parameter can only be set in the <filename>postgresql.conf</filename> - file or on the server command line; - but the setting can be overridden for individual tables by - changing table storage parameters. - </para> - </listitem> - </varlistentry> + <varlistentry id="guc-autovacuum-vacuum-scale-factor" xreflabel="autovacuum_vacuum_scale_factor"> + <term><varname>autovacuum_vacuum_scale_factor</varname> (<type>floating point</type>) + <indexterm> + <primary><varname>autovacuum_vacuum_scale_factor</varname></primary> + <secondary>configuration parameter</secondary> + </indexterm> + </term> + <listitem> + <para> + Specifies a fraction of the table size to add to + <varname>autovacuum_vacuum_threshold</varname> + when deciding whether to trigger a <command>VACUUM</command>. + The default is 0.2 (20% of table size). + This parameter can only be set in the <filename>postgresql.conf</filename> + file or on the server command line; + but the setting can be overridden for individual tables by + changing table storage parameters. + </para> + </listitem> + </varlistentry> - <varlistentry id="guc-autovacuum-vacuum-insert-scale-factor" xreflabel="autovacuum_vacuum_insert_scale_factor"> - <term><varname>autovacuum_vacuum_insert_scale_factor</varname> (<type>floating point</type>) - <indexterm> - <primary><varname>autovacuum_vacuum_insert_scale_factor</varname></primary> - <secondary>configuration parameter</secondary> - </indexterm> - </term> - <listitem> - <para> - Specifies a fraction of the table size to add to - <varname>autovacuum_vacuum_insert_threshold</varname> - when deciding whether to trigger a <command>VACUUM</command>. - The default is 0.2 (20% of table size). - This parameter can only be set in the <filename>postgresql.conf</filename> - file or on the server command line; - but the setting can be overridden for individual tables by - changing table storage parameters. - </para> - </listitem> - </varlistentry> + <varlistentry id="guc-autovacuum-vacuum-insert-scale-factor" xreflabel="autovacuum_vacuum_insert_scale_factor"> + <term><varname>autovacuum_vacuum_insert_scale_factor</varname> (<type>floating point</type>) + <indexterm> + <primary><varname>autovacuum_vacuum_insert_scale_factor</varname></primary> + <secondary>configuration parameter</secondary> + </indexterm> + </term> + <listitem> + <para> + Specifies a fraction of the table size to add to + <varname>autovacuum_vacuum_insert_threshold</varname> + when deciding whether to trigger a <command>VACUUM</command>. + The default is 0.2 (20% of table size). + This parameter can only be set in the <filename>postgresql.conf</filename> + file or on the server command line; + but the setting can be overridden for individual tables by + changing table storage parameters. + </para> + </listitem> + </varlistentry> - <varlistentry id="guc-autovacuum-analyze-scale-factor" xreflabel="autovacuum_analyze_scale_factor"> - <term><varname>autovacuum_analyze_scale_factor</varname> (<type>floating point</type>) - <indexterm> - <primary><varname>autovacuum_analyze_scale_factor</varname></primary> - <secondary>configuration parameter</secondary> - </indexterm> - </term> - <listitem> - <para> - Specifies a fraction of the table size to add to - <varname>autovacuum_analyze_threshold</varname> - when deciding whether to trigger an <command>ANALYZE</command>. - The default is 0.1 (10% of table size). - This parameter can only be set in the <filename>postgresql.conf</filename> - file or on the server command line; - but the setting can be overridden for individual tables by - changing table storage parameters. - </para> - </listitem> - </varlistentry> + <varlistentry id="guc-autovacuum-analyze-scale-factor" xreflabel="autovacuum_analyze_scale_factor"> + <term><varname>autovacuum_analyze_scale_factor</varname> (<type>floating point</type>) + <indexterm> + <primary><varname>autovacuum_analyze_scale_factor</varname></primary> + <secondary>configuration parameter</secondary> + </indexterm> + </term> + <listitem> + <para> + Specifies a fraction of the table size to add to + <varname>autovacuum_analyze_threshold</varname> + when deciding whether to trigger an <command>ANALYZE</command>. + The default is 0.1 (10% of table size). + This parameter can only be set in the <filename>postgresql.conf</filename> + file or on the server command line; + but the setting can be overridden for individual tables by + changing table storage parameters. + </para> + </listitem> + </varlistentry> - <varlistentry id="guc-autovacuum-freeze-max-age" xreflabel="autovacuum_freeze_max_age"> - <term><varname>autovacuum_freeze_max_age</varname> (<type>integer</type>) - <indexterm> - <primary><varname>autovacuum_freeze_max_age</varname></primary> - <secondary>configuration parameter</secondary> - </indexterm> - </term> - <listitem> - <para> - Specifies the maximum age (in transactions) that a table's - <structname>pg_class</structname>.<structfield>relfrozenxid</structfield> field can - attain before a <command>VACUUM</command> operation is forced - to prevent transaction ID wraparound within the table. - Note that the system will launch autovacuum processes to - prevent wraparound even when autovacuum is otherwise disabled. - </para> + <varlistentry id="guc-autovacuum-freeze-max-age" xreflabel="autovacuum_freeze_max_age"> + <term><varname>autovacuum_freeze_max_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>autovacuum_freeze_max_age</varname></primary> + <secondary>configuration parameter</secondary> + </indexterm> + </term> + <listitem> + <para> + Specifies the maximum age (in transactions) that a table's + <structname>pg_class</structname>.<structfield>relfrozenxid</structfield> field can + attain before a <command>VACUUM</command> operation is forced + to prevent transaction ID wraparound within the table. + Note that the system will launch autovacuum processes to + prevent wraparound even when autovacuum is otherwise disabled. + </para> - <para> - Vacuum also allows removal of old files from the - <filename>pg_xact</filename> subdirectory, which is why the default - is a relatively low 200 million transactions. - This parameter can only be set at server start, but the setting - can be reduced for individual tables by - changing table storage parameters. - For more information see <xref linkend="vacuum-for-wraparound"/>. - </para> - </listitem> - </varlistentry> + <para> + Vacuum also allows removal of old files from the + <filename>pg_xact</filename> subdirectory, which is why the default + is a relatively low 200 million transactions. + This parameter can only be set at server start, but the setting + can be reduced for individual tables by + changing table storage parameters. + For more information see <xref linkend="vacuum-for-wraparound"/>. + </para> + </listitem> + </varlistentry> - <varlistentry id="guc-autovacuum-multixact-freeze-max-age" xreflabel="autovacuum_multixact_freeze_max_age"> - <term><varname>autovacuum_multixact_freeze_max_age</varname> (<type>integer</type>) - <indexterm> - <primary><varname>autovacuum_multixact_freeze_max_age</varname></primary> - <secondary>configuration parameter</secondary> - </indexterm> - </term> - <listitem> - <para> - Specifies the maximum age (in multixacts) that a table's - <structname>pg_class</structname>.<structfield>relminmxid</structfield> field can - attain before a <command>VACUUM</command> operation is forced to - prevent multixact ID wraparound within the table. - Note that the system will launch autovacuum processes to - prevent wraparound even when autovacuum is otherwise disabled. - </para> + <varlistentry id="guc-autovacuum-multixact-freeze-max-age" xreflabel="autovacuum_multixact_freeze_max_age"> + <term><varname>autovacuum_multixact_freeze_max_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>autovacuum_multixact_freeze_max_age</varname></primary> + <secondary>configuration parameter</secondary> + </indexterm> + </term> + <listitem> + <para> + Specifies the maximum age (in multixacts) that a table's + <structname>pg_class</structname>.<structfield>relminmxid</structfield> field can + attain before a <command>VACUUM</command> operation is forced to + prevent multixact ID wraparound within the table. + Note that the system will launch autovacuum processes to + prevent wraparound even when autovacuum is otherwise disabled. + </para> - <para> - Vacuuming multixacts also allows removal of old files from the - <filename>pg_multixact/members</filename> and <filename>pg_multixact/offsets</filename> - subdirectories, which is why the default is a relatively low - 400 million multixacts. - This parameter can only be set at server start, but the setting can - be reduced for individual tables by changing table storage parameters. - For more information see <xref linkend="vacuum-for-multixact-wraparound"/>. - </para> - </listitem> - </varlistentry> + <para> + Vacuuming multixacts also allows removal of old files from the + <filename>pg_multixact/members</filename> and <filename>pg_multixact/offsets</filename> + subdirectories, which is why the default is a relatively low + 400 million multixacts. + This parameter can only be set at server start, but the setting can + be reduced for individual tables by changing table storage parameters. + For more information see <xref linkend="vacuum-for-multixact-wraparound"/>. + </para> + </listitem> + </varlistentry> - <varlistentry id="guc-autovacuum-vacuum-cost-delay" xreflabel="autovacuum_vacuum_cost_delay"> - <term><varname>autovacuum_vacuum_cost_delay</varname> (<type>floating point</type>) - <indexterm> - <primary><varname>autovacuum_vacuum_cost_delay</varname></primary> - <secondary>configuration parameter</secondary> - </indexterm> - </term> - <listitem> - <para> - Specifies the cost delay value that will be used in automatic - <command>VACUUM</command> operations. If -1 is specified, the regular - <xref linkend="guc-vacuum-cost-delay"/> value will be used. - If this value is specified without units, it is taken as milliseconds. - The default value is 2 milliseconds. - This parameter can only be set in the <filename>postgresql.conf</filename> - file or on the server command line; - but the setting can be overridden for individual tables by - changing table storage parameters. - </para> - </listitem> - </varlistentry> + <varlistentry id="guc-autovacuum-vacuum-cost-delay" xreflabel="autovacuum_vacuum_cost_delay"> + <term><varname>autovacuum_vacuum_cost_delay</varname> (<type>floating point</type>) + <indexterm> + <primary><varname>autovacuum_vacuum_cost_delay</varname></primary> + <secondary>configuration parameter</secondary> + </indexterm> + </term> + <listitem> + <para> + Specifies the cost delay value that will be used in automatic + <command>VACUUM</command> operations. If -1 is specified, the regular + <xref linkend="guc-vacuum-cost-delay"/> value will be used. + If this value is specified without units, it is taken as milliseconds. + The default value is 2 milliseconds. + This parameter can only be set in the <filename>postgresql.conf</filename> + file or on the server command line; + but the setting can be overridden for individual tables by + changing table storage parameters. + </para> + </listitem> + </varlistentry> - <varlistentry id="guc-autovacuum-vacuum-cost-limit" xreflabel="autovacuum_vacuum_cost_limit"> - <term><varname>autovacuum_vacuum_cost_limit</varname> (<type>integer</type>) - <indexterm> - <primary><varname>autovacuum_vacuum_cost_limit</varname></primary> - <secondary>configuration parameter</secondary> - </indexterm> - </term> - <listitem> - <para> - Specifies the cost limit value that will be used in automatic - <command>VACUUM</command> operations. If -1 is specified (which is the - default), the regular - <xref linkend="guc-vacuum-cost-limit"/> value will be used. Note that - the value is distributed proportionally among the running autovacuum - workers, if there is more than one, so that the sum of the limits for - each worker does not exceed the value of this variable. - This parameter can only be set in the <filename>postgresql.conf</filename> - file or on the server command line; - but the setting can be overridden for individual tables by - changing table storage parameters. - </para> - </listitem> - </varlistentry> + <varlistentry id="guc-autovacuum-vacuum-cost-limit" xreflabel="autovacuum_vacuum_cost_limit"> + <term><varname>autovacuum_vacuum_cost_limit</varname> (<type>integer</type>) + <indexterm> + <primary><varname>autovacuum_vacuum_cost_limit</varname></primary> + <secondary>configuration parameter</secondary> + </indexterm> + </term> + <listitem> + <para> + Specifies the cost limit value that will be used in automatic + <command>VACUUM</command> operations. If -1 is specified (which is the + default), the regular + <xref linkend="guc-vacuum-cost-limit"/> value will be used. Note that + the value is distributed proportionally among the running autovacuum + workers, if there is more than one, so that the sum of the limits for + each worker does not exceed the value of this variable. + This parameter can only be set in the <filename>postgresql.conf</filename> + file or on the server command line; + but the setting can be overridden for individual tables by + changing table storage parameters. + </para> + </listitem> + </varlistentry> - </variablelist> + </variablelist> + </sect2> + + <sect2 id="runtime-config-resource-vacuum-cost"> + <title>Cost-based Vacuum Delay</title> + + <para> + During the execution of <xref linkend="sql-vacuum"/> + and <xref linkend="sql-analyze"/> + commands, the system maintains an + internal counter that keeps track of the estimated cost of the + various I/O operations that are performed. When the accumulated + cost reaches a limit (specified by + <varname>vacuum_cost_limit</varname>), the process performing + the operation will sleep for a short period of time, as specified by + <varname>vacuum_cost_delay</varname>. Then it will reset the + counter and continue execution. + </para> + + <para> + The intent of this feature is to allow administrators to reduce + the I/O impact of these commands on concurrent database + activity. There are many situations where it is not + important that maintenance commands like + <command>VACUUM</command> and <command>ANALYZE</command> finish + quickly; however, it is usually very important that these + commands do not significantly interfere with the ability of the + system to perform other database operations. Cost-based vacuum + delay provides a way for administrators to achieve this. + </para> + + <para> + This feature is disabled by default for manually issued + <command>VACUUM</command> commands. To enable it, set the + <varname>vacuum_cost_delay</varname> variable to a nonzero + value. + </para> + + <variablelist> + <varlistentry id="guc-vacuum-cost-delay" xreflabel="vacuum_cost_delay"> + <term><varname>vacuum_cost_delay</varname> (<type>floating point</type>) + <indexterm> + <primary><varname>vacuum_cost_delay</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + The amount of time that the process will sleep + when the cost limit has been exceeded. + If this value is specified without units, it is taken as milliseconds. + The default value is zero, which disables the cost-based vacuum + delay feature. Positive values enable cost-based vacuuming. + </para> + + <para> + When using cost-based vacuuming, appropriate values for + <varname>vacuum_cost_delay</varname> are usually quite small, perhaps + less than 1 millisecond. While <varname>vacuum_cost_delay</varname> + can be set to fractional-millisecond values, such delays may not be + measured accurately on older platforms. On such platforms, + increasing <command>VACUUM</command>'s throttled resource consumption + above what you get at 1ms will require changing the other vacuum cost + parameters. You should, nonetheless, + keep <varname>vacuum_cost_delay</varname> as small as your platform + will consistently measure; large delays are not helpful. + </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-vacuum-cost-page-hit" xreflabel="vacuum_cost_page_hit"> + <term><varname>vacuum_cost_page_hit</varname> (<type>integer</type>) + <indexterm> + <primary><varname>vacuum_cost_page_hit</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + The estimated cost for vacuuming a buffer found in the shared buffer + cache. It represents the cost to lock the buffer pool, lookup + the shared hash table and scan the content of the page. The + default value is one. + </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-vacuum-cost-page-miss" xreflabel="vacuum_cost_page_miss"> + <term><varname>vacuum_cost_page_miss</varname> (<type>integer</type>) + <indexterm> + <primary><varname>vacuum_cost_page_miss</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + The estimated cost for vacuuming a buffer that has to be read from + disk. This represents the effort to lock the buffer pool, + lookup the shared hash table, read the desired block in from + the disk and scan its content. The default value is 2. + </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-vacuum-cost-page-dirty" xreflabel="vacuum_cost_page_dirty"> + <term><varname>vacuum_cost_page_dirty</varname> (<type>integer</type>) + <indexterm> + <primary><varname>vacuum_cost_page_dirty</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + The estimated cost charged when vacuum modifies a block that was + previously clean. It represents the extra I/O required to + flush the dirty block out to disk again. The default value is + 20. + </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-vacuum-cost-limit" xreflabel="vacuum_cost_limit"> + <term><varname>vacuum_cost_limit</varname> (<type>integer</type>) + <indexterm> + <primary><varname>vacuum_cost_limit</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + This is the accumulated cost that will cause the vacuuming process to sleep + for <varname>vacuum_cost_delay</varname>. The default is 200. + </para> + </listitem> + </varlistentry> + </variablelist> + + <note> + <para> + There are certain operations that hold critical locks and should + therefore complete as quickly as possible. Cost-based vacuum + delays do not occur during such operations. Therefore it is + possible that the cost accumulates far higher than the specified + limit. To avoid uselessly long delays in such cases, the actual + delay is calculated as <varname>vacuum_cost_delay</varname> * + <varname>accumulated_balance</varname> / + <varname>vacuum_cost_limit</varname> with a maximum of + <varname>vacuum_cost_delay</varname> * 4. + </para> + </note> + </sect2> + + <sect2 id="runtime-config-vacuum-freezing"> + <title>Freezing</title> + + <para> + Vacuum operations are also responsible for freezing rows to avoid + transaction ID wraparound. These settings control vacuum's freezing + behavior. See <xref linkend="vacuum-for-wraparound"/> for more + information on transaction ID wraparound and tuning these parameters. + </para> + + <variablelist> + <varlistentry id="guc-vacuum-freeze-table-age" xreflabel="vacuum_freeze_table_age"> + <term><varname>vacuum_freeze_table_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>vacuum_freeze_table_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + <command>VACUUM</command> performs an aggressive scan if the table's + <structname>pg_class</structname>.<structfield>relfrozenxid</structfield> field has reached + the age specified by this setting. An aggressive scan differs from + a regular <command>VACUUM</command> in that it visits every page that might + contain unfrozen XIDs or MXIDs, not just those that might contain dead + tuples. The default is 150 million transactions. Although users can + set this value anywhere from zero to two billion, <command>VACUUM</command> + will silently limit the effective value to 95% of + <xref linkend="guc-autovacuum-freeze-max-age"/>, so that a + periodic manual <command>VACUUM</command> has a chance to run before an + anti-wraparound autovacuum is launched for the table. For more + information see + <xref linkend="vacuum-for-wraparound"/>. + </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-vacuum-freeze-min-age" xreflabel="vacuum_freeze_min_age"> + <term><varname>vacuum_freeze_min_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>vacuum_freeze_min_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Specifies the cutoff age (in transactions) that + <command>VACUUM</command> should use to decide whether to + trigger freezing of pages that have an older XID. + The default is 50 million transactions. Although + users can set this value anywhere from zero to one billion, + <command>VACUUM</command> will silently limit the effective value to half + the value of <xref linkend="guc-autovacuum-freeze-max-age"/>, so + that there is not an unreasonably short time between forced + autovacuums. For more information see <xref + linkend="vacuum-for-wraparound"/>. + </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-vacuum-failsafe-age" xreflabel="vacuum_failsafe_age"> + <term><varname>vacuum_failsafe_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>vacuum_failsafe_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Specifies the maximum age (in transactions) that a table's + <structname>pg_class</structname>.<structfield>relfrozenxid</structfield> + field can attain before <command>VACUUM</command> takes + extraordinary measures to avoid system-wide transaction ID + wraparound failure. This is <command>VACUUM</command>'s + strategy of last resort. The failsafe typically triggers + when an autovacuum to prevent transaction ID wraparound has + already been running for some time, though it's possible for + the failsafe to trigger during any <command>VACUUM</command>. + </para> + <para> + When the failsafe is triggered, any cost-based delay that is + in effect will no longer be applied, further non-essential + maintenance tasks (such as index vacuuming) are bypassed, and any + <glossterm linkend="glossary-buffer-access-strategy">Buffer Access Strategy</glossterm> + in use will be disabled resulting in <command>VACUUM</command> being + free to make use of all of + <glossterm linkend="glossary-shared-memory">shared buffers</glossterm>. + </para> + <para> + The default is 1.6 billion transactions. Although users can + set this value anywhere from zero to 2.1 billion, + <command>VACUUM</command> will silently adjust the effective + value to no less than 105% of <xref + linkend="guc-autovacuum-freeze-max-age"/>. + </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-vacuum-multixact-freeze-table-age" xreflabel="vacuum_multixact_freeze_table_age"> + <term><varname>vacuum_multixact_freeze_table_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>vacuum_multixact_freeze_table_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + <command>VACUUM</command> performs an aggressive scan if the table's + <structname>pg_class</structname>.<structfield>relminmxid</structfield> field has reached + the age specified by this setting. An aggressive scan differs from + a regular <command>VACUUM</command> in that it visits every page that might + contain unfrozen XIDs or MXIDs, not just those that might contain dead + tuples. The default is 150 million multixacts. + Although users can set this value anywhere from zero to two billion, + <command>VACUUM</command> will silently limit the effective value to 95% of + <xref linkend="guc-autovacuum-multixact-freeze-max-age"/>, so that a + periodic manual <command>VACUUM</command> has a chance to run before an + anti-wraparound is launched for the table. + For more information see <xref linkend="vacuum-for-multixact-wraparound"/>. + </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-vacuum-multixact-freeze-min-age" xreflabel="vacuum_multixact_freeze_min_age"> + <term><varname>vacuum_multixact_freeze_min_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>vacuum_multixact_freeze_min_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Specifies the cutoff age (in multixacts) that <command>VACUUM</command> + should use to decide whether to trigger freezing of pages with + an older multixact ID. The default is 5 million multixacts. + Although users can set this value anywhere from zero to one billion, + <command>VACUUM</command> will silently limit the effective value to half + the value of <xref linkend="guc-autovacuum-multixact-freeze-max-age"/>, + so that there is not an unreasonably short time between forced + autovacuums. + For more information see <xref linkend="vacuum-for-multixact-wraparound"/>. + </para> + </listitem> + </varlistentry> + + <varlistentry id="guc-vacuum-multixact-failsafe-age" xreflabel="vacuum_multixact_failsafe_age"> + <term><varname>vacuum_multixact_failsafe_age</varname> (<type>integer</type>) + <indexterm> + <primary><varname>vacuum_multixact_failsafe_age</varname> configuration parameter</primary> + </indexterm> + </term> + <listitem> + <para> + Specifies the maximum age (in multixacts) that a table's + <structname>pg_class</structname>.<structfield>relminmxid</structfield> + field can attain before <command>VACUUM</command> takes + extraordinary measures to avoid system-wide multixact ID + wraparound failure. This is <command>VACUUM</command>'s + strategy of last resort. The failsafe typically triggers when + an autovacuum to prevent transaction ID wraparound has already + been running for some time, though it's possible for the + failsafe to trigger during any <command>VACUUM</command>. + </para> + <para> + When the failsafe is triggered, any cost-based delay that is + in effect will no longer be applied, and further non-essential + maintenance tasks (such as index vacuuming) are bypassed. + </para> + <para> + The default is 1.6 billion multixacts. Although users can set + this value anywhere from zero to 2.1 billion, + <command>VACUUM</command> will silently adjust the effective + value to no less than 105% of <xref + linkend="guc-autovacuum-multixact-freeze-max-age"/>. + </para> + </listitem> + </varlistentry> + + </variablelist> + </sect2> </sect1> <sect1 id="runtime-config-client"> @@ -9592,168 +9773,6 @@ COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv; </listitem> </varlistentry> - <varlistentry id="guc-vacuum-freeze-table-age" xreflabel="vacuum_freeze_table_age"> - <term><varname>vacuum_freeze_table_age</varname> (<type>integer</type>) - <indexterm> - <primary><varname>vacuum_freeze_table_age</varname> configuration parameter</primary> - </indexterm> - </term> - <listitem> - <para> - <command>VACUUM</command> performs an aggressive scan if the table's - <structname>pg_class</structname>.<structfield>relfrozenxid</structfield> field has reached - the age specified by this setting. An aggressive scan differs from - a regular <command>VACUUM</command> in that it visits every page that might - contain unfrozen XIDs or MXIDs, not just those that might contain dead - tuples. The default is 150 million transactions. Although users can - set this value anywhere from zero to two billion, <command>VACUUM</command> - will silently limit the effective value to 95% of - <xref linkend="guc-autovacuum-freeze-max-age"/>, so that a - periodic manual <command>VACUUM</command> has a chance to run before an - anti-wraparound autovacuum is launched for the table. For more - information see - <xref linkend="vacuum-for-wraparound"/>. - </para> - </listitem> - </varlistentry> - - <varlistentry id="guc-vacuum-freeze-min-age" xreflabel="vacuum_freeze_min_age"> - <term><varname>vacuum_freeze_min_age</varname> (<type>integer</type>) - <indexterm> - <primary><varname>vacuum_freeze_min_age</varname> configuration parameter</primary> - </indexterm> - </term> - <listitem> - <para> - Specifies the cutoff age (in transactions) that - <command>VACUUM</command> should use to decide whether to - trigger freezing of pages that have an older XID. - The default is 50 million transactions. Although - users can set this value anywhere from zero to one billion, - <command>VACUUM</command> will silently limit the effective value to half - the value of <xref linkend="guc-autovacuum-freeze-max-age"/>, so - that there is not an unreasonably short time between forced - autovacuums. For more information see <xref - linkend="vacuum-for-wraparound"/>. - </para> - </listitem> - </varlistentry> - - <varlistentry id="guc-vacuum-failsafe-age" xreflabel="vacuum_failsafe_age"> - <term><varname>vacuum_failsafe_age</varname> (<type>integer</type>) - <indexterm> - <primary><varname>vacuum_failsafe_age</varname> configuration parameter</primary> - </indexterm> - </term> - <listitem> - <para> - Specifies the maximum age (in transactions) that a table's - <structname>pg_class</structname>.<structfield>relfrozenxid</structfield> - field can attain before <command>VACUUM</command> takes - extraordinary measures to avoid system-wide transaction ID - wraparound failure. This is <command>VACUUM</command>'s - strategy of last resort. The failsafe typically triggers - when an autovacuum to prevent transaction ID wraparound has - already been running for some time, though it's possible for - the failsafe to trigger during any <command>VACUUM</command>. - </para> - <para> - When the failsafe is triggered, any cost-based delay that is - in effect will no longer be applied, further non-essential - maintenance tasks (such as index vacuuming) are bypassed, and any - <glossterm linkend="glossary-buffer-access-strategy">Buffer Access Strategy</glossterm> - in use will be disabled resulting in <command>VACUUM</command> being - free to make use of all of - <glossterm linkend="glossary-shared-memory">shared buffers</glossterm>. - </para> - <para> - The default is 1.6 billion transactions. Although users can - set this value anywhere from zero to 2.1 billion, - <command>VACUUM</command> will silently adjust the effective - value to no less than 105% of <xref - linkend="guc-autovacuum-freeze-max-age"/>. - </para> - </listitem> - </varlistentry> - - <varlistentry id="guc-vacuum-multixact-freeze-table-age" xreflabel="vacuum_multixact_freeze_table_age"> - <term><varname>vacuum_multixact_freeze_table_age</varname> (<type>integer</type>) - <indexterm> - <primary><varname>vacuum_multixact_freeze_table_age</varname> configuration parameter</primary> - </indexterm> - </term> - <listitem> - <para> - <command>VACUUM</command> performs an aggressive scan if the table's - <structname>pg_class</structname>.<structfield>relminmxid</structfield> field has reached - the age specified by this setting. An aggressive scan differs from - a regular <command>VACUUM</command> in that it visits every page that might - contain unfrozen XIDs or MXIDs, not just those that might contain dead - tuples. The default is 150 million multixacts. - Although users can set this value anywhere from zero to two billion, - <command>VACUUM</command> will silently limit the effective value to 95% of - <xref linkend="guc-autovacuum-multixact-freeze-max-age"/>, so that a - periodic manual <command>VACUUM</command> has a chance to run before an - anti-wraparound is launched for the table. - For more information see <xref linkend="vacuum-for-multixact-wraparound"/>. - </para> - </listitem> - </varlistentry> - - <varlistentry id="guc-vacuum-multixact-freeze-min-age" xreflabel="vacuum_multixact_freeze_min_age"> - <term><varname>vacuum_multixact_freeze_min_age</varname> (<type>integer</type>) - <indexterm> - <primary><varname>vacuum_multixact_freeze_min_age</varname> configuration parameter</primary> - </indexterm> - </term> - <listitem> - <para> - Specifies the cutoff age (in multixacts) that <command>VACUUM</command> - should use to decide whether to trigger freezing of pages with - an older multixact ID. The default is 5 million multixacts. - Although users can set this value anywhere from zero to one billion, - <command>VACUUM</command> will silently limit the effective value to half - the value of <xref linkend="guc-autovacuum-multixact-freeze-max-age"/>, - so that there is not an unreasonably short time between forced - autovacuums. - For more information see <xref linkend="vacuum-for-multixact-wraparound"/>. - </para> - </listitem> - </varlistentry> - - <varlistentry id="guc-vacuum-multixact-failsafe-age" xreflabel="vacuum_multixact_failsafe_age"> - <term><varname>vacuum_multixact_failsafe_age</varname> (<type>integer</type>) - <indexterm> - <primary><varname>vacuum_multixact_failsafe_age</varname> configuration parameter</primary> - </indexterm> - </term> - <listitem> - <para> - Specifies the maximum age (in multixacts) that a table's - <structname>pg_class</structname>.<structfield>relminmxid</structfield> - field can attain before <command>VACUUM</command> takes - extraordinary measures to avoid system-wide multixact ID - wraparound failure. This is <command>VACUUM</command>'s - strategy of last resort. The failsafe typically triggers when - an autovacuum to prevent transaction ID wraparound has already - been running for some time, though it's possible for the - failsafe to trigger during any <command>VACUUM</command>. - </para> - <para> - When the failsafe is triggered, any cost-based delay that is - in effect will no longer be applied, and further non-essential - maintenance tasks (such as index vacuuming) are bypassed. - </para> - <para> - The default is 1.6 billion multixacts. Although users can set - this value anywhere from zero to 2.1 billion, - <command>VACUUM</command> will silently adjust the effective - value to no less than 105% of <xref - linkend="guc-autovacuum-multixact-freeze-max-age"/>. - </para> - </listitem> - </varlistentry> - <varlistentry id="guc-bytea-output" xreflabel="bytea_output"> <term><varname>bytea_output</varname> (<type>enum</type>) <indexterm> -- 2.34.1
From b544d0de0f98f8ba1ea36c4e50032016d6561844 Mon Sep 17 00:00:00 2001 From: Melanie Plageman <melanieplage...@gmail.com> Date: Wed, 11 Dec 2024 14:13:34 -0500 Subject: [PATCH v5 2/3] Add more general summary to vacuumlazy.c Add more comments at the top of vacuumlazy.c on heap relation vacuuming implementation. Previously vacuumlazy.c only had details related to the dead TID storage added in Postgres 17. This commit adds a more general summary to help future developers understand the heap relation vacuum design and implementation at a high level. Reviewed-by: Robert Haas, Bilal Yavuz Discussion: https://postgr.es/m/flat/CAAKRu_ZF_KCzZuOrPrOqjGVe8iRVWEAJSpzMgRQs%3D5-v84cXUg%40mail.gmail.com --- src/backend/access/heap/vacuumlazy.c | 42 ++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) diff --git a/src/backend/access/heap/vacuumlazy.c b/src/backend/access/heap/vacuumlazy.c index 09fab08b8e1..c047fb20f7a 100644 --- a/src/backend/access/heap/vacuumlazy.c +++ b/src/backend/access/heap/vacuumlazy.c @@ -3,6 +3,48 @@ * vacuumlazy.c * Concurrent ("lazy") vacuuming. * + * Heap relations are vacuumed in three main phases. In phase I, vacuum scans + * relation pages, pruning and freezing tuples and saving dead tuples' TIDs in + * a TID store. If that TID store fills up or vacuum finishes scanning the + * relation, it progresses to phase II: index vacuuming. Index vacuuming + * deletes the dead index entries referenced in the TID store. In phase III, + * vacuum scans the blocks of the relation indicated by the TIDs in the TID + * store and reaps the dead tuples, freeing that space for future tuples. + * + * If there are no indexes or index scanning is disabled, phase II may be + * skipped. If phase I identified very few dead index entries, vacuum may skip + * phases II and III. If the TID store fills up in phase I, vacuum suspends + * phase I, proceeds to phases II and II and cleans up the dead tuples + * referenced in the current TID store. This empties the TID store and allows + * vacuum to resume phase I. In this sense, the phases are more like states in + * a state machine, but they have been referred to colloquially as phases for + * long enough that it makes sense to refer to them in that way here. + * + * Finally, vacuum may truncate the relation if it has emptied pages at the + * end. After finishing all phases of work, vacuum updates relation statistics + * in pg_class and the cumulative statistics subsystem. + * + * Relation Scanning: + * + * Vacuum scans the heap relation, starting at the beginning and progressing + * to the end, skipping pages as permitted by their visibility status, vacuum + * options, and the eagerness level of the vacuum. + * + * When page skipping is enabled, non-aggressive vacuums may skip scanning + * pages that are marked all-visible in the visibility map. We may choose not + * to skip pages if the range of skippable pages is below + * SKIP_PAGES_THRESHOLD. + * + * Once vacuum has decided to scan a given block, it must read in the block + * and obtain a cleanup lock to prune tuples on the page. A non-aggressive + * vacuum may choose to skip pruning and freezing if it cannot acquire a + * cleanup lock on the buffer right away. + * + * After pruning and freezing, pages that are newly all-visible and all-frozen + * are marked as such in the visibility map. + * + * Dead TID Storage: + * * The major space usage for vacuuming is storage for the dead tuple IDs that * are to be removed from indexes. We want to ensure we can vacuum even the * very largest relations with finite memory space usage. To do that, we set -- 2.34.1