Unfortunately it's still 9.6. An upgrade to the latest 13 is planned for this year.
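In the meantime, one rough way to keep an eye on the pg_class dead-tuple
build-up is the statistics collector views. A minimal sketch (works on 9.6;
`pg_stat_sys_tables` covers the system catalogs, and `n_dead_tup` is only an
estimate):

```
SELECT relname, n_live_tup, n_dead_tup, last_vacuum, last_autovacuum
FROM pg_stat_sys_tables
WHERE schemaname = 'pg_catalog'
  AND relname IN ('pg_class', 'pg_attribute', 'pg_statistic');
```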
Fri, 14 May 2021 at 12:08 Imre Samu <pella.s...@gmail.com> wrote:

> > Apart from the above hack of filtering out live tuples to a separate
> > table is there anything I could do?
>
> Is this the latest PG13.3 version?
>
> IMHO: if not, it may be worth updating to the latest patch release as
> soon as possible:
>
> https://www.postgresql.org/docs/release/13.3/
> Release date: 2021-05-13
>
> *"Disable the vacuum_cleanup_index_scale_factor parameter and storage
> option (Peter Geoghegan)*
> *The notion of tracking “stale” index statistics proved to interact badly
> with the autovacuum_vacuum_insert_threshold parameter, resulting in
> unnecessary full-index scans and consequent degradation of autovacuum
> performance. The latter mechanism seems superior, so remove the
> stale-statistics logic. The control parameter for that,
> vacuum_cleanup_index_scale_factor, will be removed entirely in v14. In
> v13, it remains present to avoid breaking existing configuration files,
> but it no longer does anything."*
>
> best,
> Imre
>
> Marcin Gozdalik <goz...@gmail.com> wrote (Fri, 14 May 2021, 13:20):
>
>> Hi
>>
>> I am trying to use `pgmetrics` on a big (10TB+), busy (1GB/s RW)
>> database. It takes around 5 minutes for pgmetrics to run. I traced the
>> problem to the "bloat query" (a version of
>> https://wiki.postgresql.org/wiki/Show_database_bloat) spinning in CPU,
>> doing no I/O.
>>
>> I traced that, in turn, to the bloated `pg_class` (the irony:
>> `pgmetrics` does not collect bloat on `pg_catalog`):
>> `vacuum (full, analyze, verbose) pg_class;`
>> ```
>> INFO: vacuuming "pg_catalog.pg_class"
>> INFO: "pg_class": found 1 removable, 7430805 nonremovable row versions
>> in 158870 pages
>> DETAIL: 7429943 dead row versions cannot be removed yet.
>> CPU 1.36s/6.40u sec elapsed 9.85 sec.
>> INFO: analyzing "pg_catalog.pg_class"
>> INFO: "pg_class": scanned 60000 of 158869 pages, containing 295 live
>> rows and 2806547 dead rows; 295 rows in sample, 781 estimated total rows
>> VACUUM
>> ```
>>
>> `pg_class` has so many dead rows because the workload is temp-table
>> heavy (creating/destroying 1M+ temporary tables per day) and has
>> long-running analytics queries running for 24h+.
>>
>> The PG query planner assumes that an index scan on `pg_class` will be
>> very quick and plans a Nested Loop with an Index Scan. However, the
>> index scan has 7M dead tuples to filter out and the query takes more
>> than 200 seconds (https://explain.depesz.com/s/bw2G).
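A side note on the "cannot be removed yet" rows above: they cannot be
reclaimed while something still holds a snapshot old enough to see them.
A minimal sketch for spotting the offending backends (it assumes only
`pg_stat_activity`; prepared transactions and replication slots would need
their own checks):

```
SELECT pid, state, xact_start,
       age(backend_xmin) AS xmin_age,   -- snapshot age, in transactions
       left(query, 60)   AS current_query
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC
LIMIT 10;
```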
>>
>> If I create a temp table from `pg_class` to contain only the live
>> tuples:
>> ```
>> CREATE TEMPORARY TABLE pg_class_alive AS SELECT oid, * FROM pg_class;
>> CREATE UNIQUE INDEX pg_class_alive_oid_index ON pg_class_alive(oid);
>> CREATE UNIQUE INDEX pg_class_alive_relname_nsp_index ON
>>     pg_class_alive(relname, relnamespace);
>> CREATE INDEX pg_class_tblspc_relfilenode_index ON
>>     pg_class_alive(reltablespace, relfilenode);
>> ANALYZE pg_class_alive;
>> ```
>>
>> and run the bloat query on `pg_class_alive` instead of `pg_class`:
>> ```
>> SELECT
>>   nn.nspname AS schemaname,
>>   cc.relname AS tablename,
>>   COALESCE(cc.reltuples,0) AS reltuples,
>>   COALESCE(cc.relpages,0) AS relpages,
>>   COALESCE(CEIL((cc.reltuples*((datahdr+8-
>>     (CASE WHEN datahdr%8=0 THEN 8 ELSE datahdr%8 END))
>>     +nullhdr2+4))/(8192-20::float)),0) AS otta
>> FROM
>>   pg_class_alive cc
>>   JOIN pg_namespace nn ON cc.relnamespace = nn.oid
>>     AND nn.nspname <> 'information_schema'
>>   LEFT JOIN (
>>     SELECT
>>       foo.nspname, foo.relname,
>>       (datawidth+32)::numeric AS datahdr,
>>       (maxfracsum*(nullhdr+8-(CASE WHEN nullhdr%8=0 THEN 8
>>         ELSE nullhdr%8 END))) AS nullhdr2
>>     FROM (
>>       SELECT
>>         ns.nspname, tbl.relname,
>>         SUM((1-coalesce(null_frac,0))*coalesce(avg_width, 2048)) AS datawidth,
>>         MAX(coalesce(null_frac,0)) AS maxfracsum,
>>         23+(
>>           SELECT 1+count(*)/8
>>           FROM pg_stats s2
>>           WHERE null_frac<>0 AND s2.schemaname = ns.nspname
>>             AND s2.tablename = tbl.relname
>>         ) AS nullhdr
>>       FROM pg_attribute att
>>       JOIN pg_class_alive tbl ON att.attrelid = tbl.oid
>>       JOIN pg_namespace ns ON ns.oid = tbl.relnamespace
>>       LEFT JOIN pg_stats s ON s.schemaname=ns.nspname
>>         AND s.tablename = tbl.relname
>>         AND s.inherited=false
>>         AND s.attname=att.attname
>>       WHERE att.attnum > 0 AND tbl.relkind='r'
>>       GROUP BY 1,2
>>     ) AS foo
>>   ) AS rs
>>     ON cc.relname = rs.relname AND nn.nspname = rs.nspname
>>   LEFT JOIN pg_index i ON indrelid = cc.oid
>>   LEFT JOIN pg_class_alive c2 ON c2.oid = i.indexrelid
>> ```
>>
>> it runs in 10s, 20x faster (https://explain.depesz.com/s/K4SH).
>>
>> The rabbit hole probably goes deeper (e.g. I should do the same for
>> pg_statistic and pg_attribute and create a new pg_stats view).
>>
>> I am not able (at least not quickly) to change the number of temporary
>> tables created or to make the analytics queries finish quicker. Apart
>> from the above hack of filtering out live tuples to a separate table,
>> is there anything I could do?
>>
>> Thank you,
>> Marcin Gozdalik
>>
>> --
>> Marcin Gozdalik
>>
>

--
Marcin Gozdalik
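A sketch of the "do the same for pg_attribute" step mentioned above, mirroring
the `pg_class_alive` snapshot; the table and index names here are illustrative
and untested, not taken from the thread:

```
-- Sketch: snapshot of live pg_attribute rows, analogous to pg_class_alive.
CREATE TEMPORARY TABLE pg_attribute_alive AS SELECT * FROM pg_attribute;
CREATE UNIQUE INDEX pg_attribute_alive_relid_attnum_index
    ON pg_attribute_alive (attrelid, attnum);
ANALYZE pg_attribute_alive;
-- The bloat query's "FROM pg_attribute att" would then join against
-- pg_attribute_alive instead.
```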