Hello.
>- Which version of postgres is this? Newer versions avoid scanning
> unchanged parts of the heap even for freezing (9.6+, with additional
> smaller improvements in 11).
Oh, totally forgot about version and settings...
server_version 10.9 (Ubuntu 10.9-103)
So, "don't vacuum all-frozen pages" included.
> - have you increased the vacuum cost limits? Before PG 12 they're so low
> they're entirely unsuitable for larger databases, and even in 12 you
> should likely increase them for a multi-TB database
Current settings are:
autovacuum_max_workers 8
autovacuum_vacuum_cost_delay 5ms
autovacuum_vacuum_cost_limit 400
autovacuum_work_mem -1
vacuum_cost_page_dirty 40
vacuum_cost_page_hit 1
vacuum_cost_page_miss 10
"autovacuum_max_workers" set to 8 because server needs to process a lot of
changing relations.
Settings were more aggressive previously (autovacuum_vacuum_cost_limit was
2800) but it leads to very high IO load causing issues with application
performance and stability (even on SSD).
"vacuum_cost_page_dirty" was set to 40 few month ago. High IO write peaks
were causing application requests to stuck into WALWriteLock.
After some investigations we found it was caused by WAL-logging peaks.
Such WAL-peaks are mostly consist of such records:
Type N(%)
Record size (%) FPI size (%)
Combined size (%)
------
Heap2/CLEAN 10520 ( 0.86)
623660 ( 0.21) 5317532 ( 0.53) 5941192
( 0.46)
Heap2/FREEZE_PAGE 113419 ( 9.29)
6673877 ( 2.26) 635354048 ( 63.12) 642027925 (
49.31)
another example:
Heap2/CLEAN 196707 ( 6.96)
12116527 ( 1.56) 292317231 ( 37.77) 304433758 (
19.64)
Heap2/FREEZE_PAGE 1819 ( 0.06)
104012 ( 0.01) 13324269 ( 1.72) 13428281 (
0.87)
Thanks,
Michail.