Hi On Mon, Mar 30, 2026 at 1:44 AM Daniil Davydov <[email protected]> wrote:
> Hi, > > On Mon, Mar 30, 2026 at 7:17 AM SATYANARAYANA NARLAPURAM > <[email protected]> wrote: > > > > Thank you for working on this, very useful feature. Sharing a few > thoughts: > > > > 1. Shouldn't we also cap by max_parallel_workers to avoid wasting DSM > resources in parallel_vacuum_compute_workers? > > Actually, autovacuum_max_parallel_workers is already limited by > max_parallel_workers. It is not clear for me why we allow setting this GUC > higher than max_parallel_workers, but if this happens, I think it is a > user's > misconfiguration. > > > 2. Is it intentional that other autovacuum workers not yield cost limits > to the parallel auto vacuum workers? Cost limits are distributed first > equally to the autovacuum workers. > > and then they share that. Therefore, parallel workers will be heavily > throttled. IIUC, this problem doesn't exist with manual vacuum. > > If we don't fix this, at least we should document this. > > Parallel a/v workers inherit cost based parameters (including the > vacuum_cost_limit) from the leader worker. Do you mean that this can be too > low value for parallel operation? If so, user can manually increase the > vacuum_cost_limit reloption for those tables, where parallel a/v sleeps too > much (due to cost delay). > > BTW, describing the cost limit propagation to the parallel a/v workers is > worth mentioning in the documentation. I'll add it in the next patch > version. > > > 3. Additionally, is there a point where, based on the cost limits, > launching additional workers becomes counterproductive compared to running > fewer workers and preventing it? > > I don't think that we can possibly find a universal limit that will be > appropriate for all possible configurations. By now we are using a pretty > simple formula for parallel degree calculation. Since user have several > ways > to affect this formula, I guess that there will be no problems with it > (except > my concerns about opt-out style). > > > 4. Would it make sense to add a table level override to disable > parallelism or set parallel worker count? > > We already have the "autovacuum_parallel_workers" reloption that is used as > an additional limit for the number of parallel workers. In particular, this > reloption can be used to disable parallelism at all. > > > > > I ran some perf tests to show the improvements with parallel vacuum and > shared below. > > Thank you very much! > > > Observations: > > > > 1. Parallel autovacuum provides consistent speedup. With cost_limit=200 > and > > 7 workers, vacuum completes 1.41x faster (71s -> 50s). With > cost_limit=60, > > the speedup is 1.25x (194s -> 154s). > > 2. I see the benefit comes from parallelizing index vacuum. With 8 > indexes totaling > > ~530 MB, parallel workers scan indexes concurrently instead of the > leader > > scanning them one by one. The leader's CPU user time drops from ~3s to > > ~0.8s as index work is offloaded > > > > 1.41 speedup with 7 parallel workers may not seem like a great win, but it > is > a whole time of autovacuum operation (not only index bulkdel/cleanup) with > pretty small indexes. > > May I ask you to run the same test with a higher table's size (several > dozen > gigabytes)? I think the results will be more "expressive". > I ran it with a Billion rows in a table with 8 indexes. The improvement with 7 workers is 1.8x. Please note that there is a fixed overhead in other vacuum steps, for example heap scan. In the environments where cost-based delay is used (the default), benefits will be modest unless vacuum_cost_delay is set to sufficiently large value. *Hardware:* CPU: Intel Xeon Platinum 8573C, 1 socket × 8 cores × 2 threads = 16 vCPUs RAM: 128 GB (131,900 MB) Swap: None *Workload Description* *Table Schema:* CREATE TABLE avtest ( id bigint PRIMARY KEY, col1 int, -- random()*1e9 col2 int, -- random()*1e9 col3 int, -- random()*1e9 col4 int, -- random()*1e9 col5 int, -- random()*1e9 col6 text, -- 'text_' || random()*1e6 (short text ~10 chars) col7 timestamp, -- now() - random()*365 days padding text -- repeat('x', 50) ) WITH (fillfactor = 90); *Indexes (8 total):* avtest_pkey — btree on (id) bigint idx_av_col1 — btree on (col1) int idx_av_col2 — btree on (col2) int idx_av_col3 — btree on (col3) int idx_av_col4 — btree on (col4) int idx_av_col5 — btree on (col5) int idx_av_col6 — btree on (col6) text idx_av_col7 — btree on (col7) timestamp Dead Tuple Generation: DELETE FROM avtest WHERE id % 5 IN (1, 2); This deletes exactly 40% of rows, uniformly distributed across all pages. Vacuum Trigger: Autovacuum is triggered naturally by lowering the threshold to 0 and setting scale_factor to a value that causes immediate launch after the DELETE. Worker Configurations Tested: 0 workers — leader-only vacuum (baseline, no parallelism) 2 workers — leader + 2 parallel workers (3 processes total) 4 workers — leader + 4 parallel workers (5 processes total) 7 workers — leader + 7 parallel workers (8 processes total, 1 per index) Dataset: Rows: 1,000,000,000 Heap size: 139 GB Total size: 279 GB (heap + 8 indexes) Dead tuples: 400,000,000 (40%) Index Sizes: avtest_pkey 21 GB (bigint) idx_av_col7 21 GB (timestamp) idx_av_col1 18 GB (int) idx_av_col2 18 GB (int) idx_av_col3 18 GB (int) idx_av_col4 18 GB (int) idx_av_col5 18 GB (int) idx_av_col6 7 GB (text — shorter keys, smaller index) Total indexes: 139 GB Server Settings: shared_buffers = 96GB maintenance_work_mem = 1GB max_wal_size = 100GB checkpoint_timeout = 1h autovacuum_vacuum_cost_delay = 0ms (NO throttling) autovacuum_vacuum_cost_limit = 1000 *Summary:* Workers Avg(s) Min(s) Max(s) Speedup Time Saved ------- ------ ------ ------ ------- ---------- 0 1645.93 1645.01 1646.84 1.00x — 2 1276.35 1275.64 1277.05 1.29x 369.58s (6.2 min) 4 1052.62 1048.92 1056.32 1.56x 593.31s (9.9 min) 7 892.23 886.59 897.86 1.84x 753.70s (12.6 min) Thanks, Satya
