Nikolay Samokhvalov <n...@postgres.ai> 2:47 PM (0 minutes ago) to pglsql-hackers I just learned that vacuumdb --analyze-only doesn't update stats for the partitioned table itself, taking care only about individual partitions:
(DDL doesn't matter here) # vacuumdb --analyze-only -U postgres test --verbose ... INFO: analyzing "public.measurement_2023_01" INFO: "measurement_2023_01": scanned 6370 of 6370 pages, containing 1000000 live rows and 0 dead rows; 30000 rows in sample, 1000000 estimated total rows INFO: "measurement_2023_02": scanned 6257 of 6257 pages, containing 982279 live rows and 0 dead rows; 30000 rows in sample, 982279 estimated total rows INFO: analyzing "public.measurement_2023_03" INFO: "measurement_2023_03": scanned 6483 of 6483 pages, containing 1017721 live rows and 0 dead rows; 30000 rows in sample, 1017721 estimated total rows ... test=# select starelid::regclass, count(*) from pg_statistic where starelid::regclass::text ~ 'measurement' group by 1 order by 1; starelid | count ---------------------+------- measurement_2023_01 | 4 measurement_2023_02 | 4 measurement_2023_03 | 4 (3 rows) While for the single-threaded SQL-level ANALYZE: test=# analyze verbose measurement; ... test=# select starelid::regclass, count(*) from pg_statistic where starelid::regclass::text ~ 'measurement' group by 1 order by 1; starelid | count ---------------------+------- measurement | 4 measurement_2023_01 | 4 measurement_2023_02 | 4 measurement_2023_03 | 4 (4 rows) This means that if, after running pg_upgrade, we use vacuumdb to update stats faster, some stats may be missing, potentially leading to suboptimal performance. Additionally, it doesn't help that pg_stat_all_tables doesn't show counters/timestamps for partitioned table, even after SQL-level ANALYZE: test=# select relname, analyze_count, autoanalyze_count, last_analyze, last_autoanalyze from pg_stat_user_tables where relname ~ 'measurement'; relname | analyze_count | autoanalyze_count | last_analyze | last_autoanalyze ---------------------+---------------+-------------------+-------------------------------+------------------ measurement_2023_01 | 2 | 0 | 2024-10-24 21:25:47.979958+00 | measurement_2023_02 | 2 | 0 | 2024-10-24 21:25:48.070355+00 | measurement_2023_03 | 2 | 0 | 2024-10-24 21:25:48.154613+00 | (3 rows) This is also discussed in https://www.postgresql.org/message-id/flat/CADofcAXVbD0yGp_EaC9chmzsOoSai3jcfBCnyva3j0RRdRvMVA%40mail.gmail.com I propose considering 3 fixes: 1) vacuumdb --analyze / --analyze-only to update stats for the partitioned table, so people using pg_upgrade are not in trouble 2) present the ANALYZE metadata for partitioned tables in pg_stat_all_tables 3) for old versions, either backpatch with fix (1) OR just add to the docs (and maybe to the final words pg_upgrade prints), suggesting something like this in addition to vacuumdb analyze-only: -- psql snippet select format( 'analyze verbose %I.%I;', relnamespace::oid::regnamespace, oid::regclass ) as vacuum_command from pg_class where relkind = 'p' \gexec Additionally, I do like the idea of ANALYZE ONLY from the -general discussion above (though, there might be confusion with logic of --analyze and --analyze-only in vacuumdb). Does it make sense? Nik