Hi all,
Following up on the recent discussion around ANALYZE
(MISSING_STATS_ONLY):[1]
I would like to start a separate discussion about a potential ANALYZE
(MODIFIED_STATS) option.
The idea is to allow manual ANALYZE to reuse the same threshold logic that
autoanalyze uses, so that when a user explicitly runs ANALYZE, only
relations that have crossed the modification threshold are processed.
Conceptually, this would use the existing formula:
analyze threshold = analyze_base_threshold
+ analyze_scale_factor * reltuples
and compare it against n_mod_since_analyze, similar to how autovacuum
decides when to trigger analyze.
The goal is not to replace autoanalyze, but to expose its decision model at
SQL level for deterministic, user-controlled execution. For example:
-Running maintenance in scripted environments
-Triggering analysis immediately after batch data loads
-Avoiding unnecessary work when running manual ANALYZE across many relations
- Environments where autovacuum is tuned conservatively or partially
disabled
Autoanalyze runs opportunistically in the background. This proposal would
allow a user to apply the same threshold logic explicitly and immediately.
A possible usage would look like: ANALYZE (MODIFIED_STATS);
I understand there is conceptual overlap with autovacuum, so I would
especially appreciate feedback on:
-Whether exposing the threshold logic at SQL level makes sense
architecturally
-Whether this should remain an explicit opt-in option
-Naming (e.g., MODIFIED_STATS vs SKIP_UNMODIFIED or something clearer)
-Whether the thresholds should reuse existing GUCs or accept per-command
overrides
I intentionally kept this separate from MISSING_STATS_ONLY, since that
option answers a different question (“are stats missing?”) while this one
would answer (“have enough rows changed to justify re-analysis?”).
I would greatly appreciate thoughts before working on a prototype patch.
Thanks again for all the feedback so far — it has been very helpful
,expecting here the same.
Regards,
Vasuki M
C-DAC,Chennai
[1][
https://www.postgresql.org/message-id/cae2r8h61ztt4ek3jmlkdpmr7alq0ue9wswwjrfhbxm0wdoj...@mail.gmail.com
]