PostgreSQL 9.6.21 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
On Fri, 14 May 2021 at 15:45, Imre Samu <pella.s...@gmail.com> wrote:

> > Unfortunately it's still 9.6.
>
> And what is your "version()"?
>
> For example:
>
> postgres=# select version();
>                                                              version
> ---------------------------------------------------------------------------------------------------------------------------------
>  PostgreSQL 9.6.22 on x86_64-pc-linux-gnu (Debian 9.6.22-1.pgdg110+1), compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
> (1 row)
>
> Imre
>
> On Fri, 14 May 2021 at 14:11, Marcin Gozdalik <goz...@gmail.com> wrote:
>
>> Unfortunately it's still 9.6. An upgrade to the latest 13 is planned for this year.
>>
>> On Fri, 14 May 2021 at 12:08, Imre Samu <pella.s...@gmail.com> wrote:
>>
>>> > Apart from the above hack of filtering out live tuples to a separate table, is there anything I could do?
>>>
>>> Is this the latest PG 13.3 version?
>>>
>>> IMHO: if not, it may be worth updating to the latest patch release as soon as possible.
>>>
>>> https://www.postgresql.org/docs/release/13.3/
>>> Release date: 2021-05-13
>>>
>>> "Disable the vacuum_cleanup_index_scale_factor parameter and storage option (Peter Geoghegan)
>>> The notion of tracking "stale" index statistics proved to interact badly with the autovacuum_vacuum_insert_threshold parameter, resulting in unnecessary full-index scans and consequent degradation of autovacuum performance. The latter mechanism seems superior, so remove the stale-statistics logic. The control parameter for that, vacuum_cleanup_index_scale_factor, will be removed entirely in v14. In v13, it remains present to avoid breaking existing configuration files, but it no longer does anything."
>>>
>>> best,
>>> Imre
>>>
>>> On Fri, 14 May 2021 at 13:20, Marcin Gozdalik <goz...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am trying to use `pgmetrics` on a big (10 TB+), busy (1 GB/s R/W) database. It takes around 5 minutes for pgmetrics to run. I traced the problem to the "bloat query" (a version of https://wiki.postgresql.org/wiki/Show_database_bloat) spinning on CPU and doing no I/O.
>>>>
>>>> The culprit is a bloated `pg_class` (the irony: `pgmetrics` does not collect bloat on `pg_catalog`):
>>>>
>>>> `vacuum (full, analyze, verbose) pg_class;`
>>>> ```
>>>> INFO:  vacuuming "pg_catalog.pg_class"
>>>> INFO:  "pg_class": found 1 removable, 7430805 nonremovable row versions in 158870 pages
>>>> DETAIL:  7429943 dead row versions cannot be removed yet.
>>>> CPU 1.36s/6.40u sec elapsed 9.85 sec.
>>>> INFO:  analyzing "pg_catalog.pg_class"
>>>> INFO:  "pg_class": scanned 60000 of 158869 pages, containing 295 live rows and 2806547 dead rows; 295 rows in sample, 781 estimated total rows
>>>> VACUUM
>>>> ```
>>>>
>>>> `pg_class` has so many dead rows because the workload is temp-table heavy (creating/destroying 1M+ temporary tables per day) and also runs analytics queries that stay open for 24h+ (the old snapshots held by those queries are what prevents VACUUM from removing the dead rows).
>>>>
>>>> The PG query planner assumes that an index scan on `pg_class` will be very quick and plans a nested loop with an index scan. However, the index scan has 7M dead tuples to filter out, and the query takes more than 200 seconds (https://explain.depesz.com/s/bw2G).
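>>>>
>>>> (Side note: the dead-tuple count can be tracked without running a VACUUM. A minimal sketch; the statistics-collector numbers are estimates, and the second query assumes the contrib pgstattuple extension is installed:)
>>>> ```
>>>> -- Estimated live/dead tuple counts from the statistics collector (no extension needed):
>>>> SELECT n_live_tup, n_dead_tup
>>>> FROM pg_stat_sys_tables
>>>> WHERE relname = 'pg_class';
>>>>
>>>> -- Exact counts via the contrib pgstattuple extension (scans the whole table):
>>>> CREATE EXTENSION IF NOT EXISTS pgstattuple;
>>>> SELECT tuple_count, dead_tuple_count, dead_tuple_percent
>>>> FROM pgstattuple('pg_catalog.pg_class');
>>>> ```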
>>>>
>>>> If I create a temp table from `pg_class` that contains only the live tuples:
>>>> ```
>>>> CREATE TEMPORARY TABLE pg_class_alive AS SELECT oid, * FROM pg_class;
>>>> CREATE UNIQUE INDEX pg_class_alive_oid_index ON pg_class_alive(oid);
>>>> CREATE UNIQUE INDEX pg_class_alive_relname_nsp_index ON pg_class_alive(relname, relnamespace);
>>>> CREATE INDEX pg_class_tblspc_relfilenode_index ON pg_class_alive(reltablespace, relfilenode);
>>>> ANALYZE pg_class_alive;
>>>> ```
>>>>
>>>> and run the bloat query on `pg_class_alive` instead of `pg_class`:
>>>> ```
>>>> SELECT
>>>>   nn.nspname AS schemaname,
>>>>   cc.relname AS tablename,
>>>>   COALESCE(cc.reltuples, 0) AS reltuples,
>>>>   COALESCE(cc.relpages, 0) AS relpages,
>>>>   COALESCE(CEIL((cc.reltuples*((datahdr+8-
>>>>     (CASE WHEN datahdr%8=0 THEN 8 ELSE datahdr%8 END))+nullhdr2+4))/(8192-20::float)), 0) AS otta
>>>> FROM
>>>>   pg_class_alive cc
>>>> JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname <> 'information_schema'
>>>> LEFT JOIN
>>>> (
>>>>   SELECT
>>>>     foo.nspname, foo.relname,
>>>>     (datawidth+32)::numeric AS datahdr,
>>>>     (maxfracsum*(nullhdr+8-(CASE WHEN nullhdr%8=0 THEN 8 ELSE nullhdr%8 END))) AS nullhdr2
>>>>   FROM (
>>>>     SELECT
>>>>       ns.nspname, tbl.relname,
>>>>       SUM((1-COALESCE(null_frac, 0))*COALESCE(avg_width, 2048)) AS datawidth,
>>>>       MAX(COALESCE(null_frac, 0)) AS maxfracsum,
>>>>       23+(
>>>>         SELECT 1+count(*)/8
>>>>         FROM pg_stats s2
>>>>         WHERE null_frac <> 0 AND s2.schemaname = ns.nspname AND s2.tablename = tbl.relname
>>>>       ) AS nullhdr
>>>>     FROM pg_attribute att
>>>>     JOIN pg_class_alive tbl ON att.attrelid = tbl.oid
>>>>     JOIN pg_namespace ns ON ns.oid = tbl.relnamespace
>>>>     LEFT JOIN pg_stats s ON s.schemaname = ns.nspname
>>>>       AND s.tablename = tbl.relname
>>>>       AND s.inherited = false
>>>>       AND s.attname = att.attname
>>>>     WHERE att.attnum > 0 AND tbl.relkind = 'r'
>>>>     GROUP BY 1, 2
>>>>   ) AS foo
>>>> ) AS rs
>>>>   ON cc.relname = rs.relname AND nn.nspname = rs.nspname
>>>> LEFT JOIN pg_index i ON i.indrelid = cc.oid
>>>> LEFT JOIN pg_class_alive c2 ON c2.oid = i.indexrelid
>>>> ```
>>>>
>>>> it runs in 10 s, 20x faster (https://explain.depesz.com/s/K4SH).
>>>>
>>>> The rabbit hole probably goes deeper (e.g. the same could be done for pg_statistic and pg_attribute, with a new pg_stats view created on top of them).
>>>>
>>>> I am not able (at least not quickly) to reduce the number of temporary tables created or to make the analytics queries finish faster. Apart from the above hack of filtering the live tuples out into a separate table, is there anything I could do?
>>>>
>>>> Thank you,
>>>> Marcin Gozdalik
>>>>
>>>> --
>>>> Marcin Gozdalik
>>
>> --
>> Marcin Gozdalik
>

--
Marcin Gozdalik