Re: autovacuum cost delay improvement thread

段坤仁(刻韧) Thu, 13 Nov 2025 18:19:43 -0800
------------------------------------------------------------------
From: 段坤仁(刻韧) <[email protected]>
Sent: Friday, November 14, 2025, 10:17
To: "pgsql-hackers" <[email protected]>
Subject: autovacuum cost delay improvement thread

Hi all,

Recently I noticed an interesting thread about table scheduling in
autovacuum [0], which contained some constructive discussion. After testing
Nathan's patch, I tried to address some of the issues raised in that
discussion, mainly those related to cost delay.

Cost delay was originally introduced to prevent autovacuum workers from
consuming excessive resources and impacting normal workloads. However, a fixed
value tends to be too low when the system is under heavy load and too high
when the system is idle. The initial idea is straightforward: adjust the cost
delay automatically based on system load. For now I am only considering IO
resources (possibly CPU and other resources in the future), and I have
implemented the following mechanisms:

1. IO Load Monitoring: Tracks IO wait events across all backend processes with
   weighted contributions (IO-sensitive backends contribute more weight than
   others).
2. Dynamic Mode Selection: Introduces three modes based on IO wait thresholds:
   FAST mode: when IO load is low (few processes in IO wait)
   SLOW mode: when IO load is high (many processes in IO wait)
   NORMAL mode: otherwise
3. Adaptive Delay Adjustment: Modifies the vacuum delay at
   vacuum_delay_point(). NORMAL mode uses the GUC-configured delay, FAST mode
   skips delays, and SLOW mode adds extra delay based on the detected IO
   contention.
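
To make the three steps concrete, here is a minimal standalone sketch in C. It
is not taken from the patch: every name (BackendSample, io_load, choose_mode,
adaptive_delay_ms), the two thresholds, and the SLOW-mode scaling rule are
assumptions chosen only for illustration.

```c
#include <stddef.h>

/* All names below are hypothetical; they do not come from the patch. */
typedef enum { AV_MODE_FAST, AV_MODE_NORMAL, AV_MODE_SLOW } AvDelayMode;

typedef struct
{
    int     in_io_wait;  /* nonzero if the backend is currently in an IO wait */
    double  weight;      /* larger for IO-sensitive backends (assumed scheme) */
} BackendSample;

/* Step 1: weighted fraction of backends in IO wait, in the range [0, 1]. */
double
io_load(const BackendSample *samples, size_t n)
{
    double waiting = 0.0;
    double total = 0.0;

    for (size_t i = 0; i < n; i++)
    {
        total += samples[i].weight;
        if (samples[i].in_io_wait)
            waiting += samples[i].weight;
    }
    return total > 0.0 ? waiting / total : 0.0;
}

/* Step 2: pick a mode from two thresholds (the threshold values are assumed). */
AvDelayMode
choose_mode(double load, double low_threshold, double high_threshold)
{
    if (load < low_threshold)
        return AV_MODE_FAST;
    if (load > high_threshold)
        return AV_MODE_SLOW;
    return AV_MODE_NORMAL;
}

/*
 * Step 3: delay in milliseconds to apply at a delay point.
 * FAST skips the delay, NORMAL uses the configured value, and SLOW scales
 * the configured value up with the load (an arbitrary illustrative rule).
 */
double
adaptive_delay_ms(AvDelayMode mode, double configured_ms, double load)
{
    switch (mode)
    {
        case AV_MODE_FAST:
            return 0.0;
        case AV_MODE_SLOW:
            return configured_ms * (1.0 + load);
        default:
            return configured_ms;
    }
}
```

With these assumed rules, one IO-sensitive backend of weight 2 waiting on IO
next to two idle backends of weight 1 gives a load of 0.5. Under thresholds of
0.2 and 0.6 that selects NORMAL mode, so the configured 2ms delay is used
unchanged.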

I ran some tests using Sami's scripts from [0]. For testing I added a debug GUC
parameter, 'debug_autovacuum_adaptive_cost_delay'; the results are shown in the
attached result-analysis.txt. My PostgreSQL configuration is as follows:
```
max_connections = 1000
autovacuum_naptime = '10s'
shared_buffers = '4GB'
autovacuum_max_workers = 6
autovacuum_vacuum_cost_delay = 2ms
autovacuum_vacuum_cost_limit = 200
```
##################################################
## ADAPTIVE OFF
##################################################
     relname      | reltuples | n_dead_tup | avg_av_time | avg_aa_time
------------------+-----------+------------+-------------+-------------
 table_batch_2    |  38100700 |          0 |    63426.75 |     2788.63
 table_batch_3    |  34528300 |          0 |    43569.38 |     2752.13
 table_batch_1    |  21774900 |          0 |    33058.09 |     2675.13
 table_13_4096000 |   4094220 |    1663211 |        0.00 |    12047.00
 table_12_2048000 |   2045990 |    1202655 |   688704.00 |    10698.33
 table_11_1024000 |   1023520 |     446236 |   337606.00 |    17622.00
 table_10_512000  |    510631 |     523053 |   138559.00 |    13200.89
 table_9_256000   |    253966 |     338711 |    54128.55 |     9159.77
 table_8_128000   |    127072 |     304428 |    17124.50 |     3508.30
 table_7_64000    |     63934 |     384090 |     4450.55 |      834.38
 table_6_32000    |     31999 |     397000 |     1690.46 |      453.33
 table_5_16000    |     16000 |     403389 |     1030.89 |      282.55
 table_4_8000     |      8000 |     408596 |      846.87 |      221.53
 table_3_4000     |      4000 |     413417 |      881.73 |      224.77
 table_16_4000    |      3998 |     424019 |      956.04 |      199.35
 table_2_2000     |      1996 |     433390 |      829.55 |      207.11
 table_15_2000    |      1988 |     429912 |      868.61 |      216.27
 table_14_1000    |       991 |     443569 |      856.38 |      149.44
 table_1_1000     |       986 |     438926 |      795.70 |      161.82
############################################
## ADAPTIVE ON
############################################
     relname      | reltuples | n_dead_tup | avg_av_time | avg_aa_time
------------------+-----------+------------+-------------+-------------
 table_batch_3    |  57988400 |          0 |      711.95 |      212.73
 table_batch_2    |  51700100 |          0 |      549.80 |      203.18
 table_batch_1    |  49688200 |          0 |      574.58 |      206.67
 table_13_4096000 |   4096250 |      22439 |     9937.00 |      400.60
 table_12_2048000 |   2048270 |       2506 |     4867.50 |      375.00
 table_11_1024000 |   1023880 |     211596 |     2351.57 |      389.25
 table_10_512000  |    512056 |      23575 |     1024.93 |      352.03
 table_9_256000   |    256011 |      42427 |      429.28 |      309.75
 table_8_128000   |    127992 |      11739 |      167.56 |      263.62
 table_7_64000    |     63998 |      15171 |       69.52 |      221.04
 table_6_32000    |     31998 |       5787 |       32.94 |      192.52
 table_5_16000    |     16000 |       6769 |       18.11 |       92.08
 table_4_8000     |      8000 |       6811 |       10.40 |       41.37
 table_3_4000     |      4000 |       7058 |        6.40 |       18.58
 table_16_4000    |      4000 |       6786 |        6.32 |       18.55
 table_15_2000    |      2000 |       6783 |        4.45 |        8.93
 table_2_2000     |      2000 |       6936 |        4.58 |        8.89
 table_1_1000     |      1000 |       6798 |        3.66 |        5.50
 table_14_1000    |      1000 |       6700 |        3.66 |        5.50

The results show that avg_av_time decreased significantly, with no noticeable
degradation in system performance.

So far this is just an exploratory patch, and there are areas that can be
improved. Any constructive feedback and suggestions are welcome.

Thank you all.

--
Duan
Alibaba Cloud Computing
[0] https://www.postgresql.org/message-id/flat/aOaAuXREwnPZVISO%40nathan 
v1-0001-autovacuum-cost-based-delay-improvements.patch
Description: Binary data
result-analysis.txt
Description: Binary data