>  Apart from the above hack of filtering out live tuples to a separate
table is there anything I could do?

This is the latest PG13.3 version?

IMHO:  If not,  maybe worth updating to the latest patch release, as soon
as possible

https://www.postgresql.org/docs/release/13.3/
Release date: 2021-05-13
*"Disable the vacuum_cleanup_index_scale_factor parameter and storage
option (Peter Geoghegan)*
*The notion of tracking “stale” index statistics proved to interact badly
with the autovacuum_vacuum_insert_threshold parameter, resulting in
unnecessary full-index scans and consequent degradation of autovacuum
performance. The latter mechanism seems superior, so remove the
stale-statistics logic. The control parameter for that,
vacuum_cleanup_index_scale_factor, will be removed entirely in v14. In v13,
it remains present to avoid breaking existing configuration files, but it
no longer does anything."*

best,
 Imre


Marcin Gozdalik <goz...@gmail.com> ezt írta (időpont: 2021. máj. 14., P,
13:20):

> Hi
>
> I am trying to use `pgmetrics` on a big (10TB+), busy (1GB/s RW) database.
> It takes around 5 minutes for pgmetrics to run. I traced the problem to the
> "bloat query" (version of
> https://wiki.postgresql.org/wiki/Show_database_bloat) spinning in CPU,
> doing no I/O.
>
> I have traced the problem to the bloated `pg_class` (the irony:
> `pgmetrics` does not collect bloat on `pg_catalog`):
> `vacuum (full, analyze, verbose) pg_class;`
> ```
> INFO:  vacuuming "pg_catalog.pg_class"
> INFO:  "pg_class": found 1 removable, 7430805 nonremovable row versions in
> 158870 pages
> DETAIL:  7429943 dead row versions cannot be removed yet.
> CPU 1.36s/6.40u sec elapsed 9.85 sec.
> INFO:  analyzing "pg_catalog.pg_class"
> INFO:  "pg_class": scanned 60000 of 158869 pages, containing 295 live rows
> and 2806547 dead rows; 295 rows in sample, 781 estimated total rows
> VACUUM
> ```
>
> `pg_class` has so many dead rows because the workload is temp-table heavy
> (creating/destroying 1M+ temporary tables per day) and has long running
> analytics queries running for 24h+.
>
> PG query planner assumes that index scan on `pg_class` will be very quick
> and plans Nested loop with Index scan. However, the index scan has 7M dead
> tuples to filter out and the query takes more than 200 seconds (
> https://explain.depesz.com/s/bw2G).
>
> If I create a temp table from `pg_class` to contain only the live tuples:
> ```
> CREATE TEMPORARY TABLE pg_class_alive AS SELECT oid,* from pg_class;
> CREATE UNIQUE INDEX pg_class_alive_oid_index ON pg_class_alive(oid);
> CREATE UNIQUE INDEX pg_class_alive_relname_nsp_index ON
> pg_class_alive(relname, relnamespace);
> CREATE INDEX pg_class_tblspc_relfilenode_index ON
> pg_class_alive(reltablespace, relfilenode);
> ANALYZE pg_class_alive;
> ```
>
> and run the bloat query on `pg_class_alive` instead of `pg_class`:
> ```
> SELECT
>    nn.nspname AS schemaname,
>    cc.relname AS tablename,
>    COALESCE(cc.reltuples,0) AS reltuples,
>    COALESCE(cc.relpages,0) AS relpages,
>    COALESCE(CEIL((cc.reltuples*((datahdr+8-
>      (CASE WHEN datahdr%8=0 THEN 8 ELSE datahdr%8
> END))+nullhdr2+4))/(8192-20::float)),0) AS otta
>  FROM
>     pg_class_alive cc
>  JOIN pg_namespace nn ON cc.relnamespace = nn.oid AND nn.nspname <>
> 'information_schema'
>  LEFT JOIN
>  (
>    SELECT
>      foo.nspname,foo.relname,
>      (datawidth+32)::numeric AS datahdr,
>      (maxfracsum*(nullhdr+8-(case when nullhdr%8=0 THEN 8 ELSE nullhdr%8
> END))) AS nullhdr2
>    FROM (
>      SELECT
>        ns.nspname, tbl.relname,
>        SUM((1-coalesce(null_frac,0))*coalesce(avg_width, 2048)) AS
> datawidth,
>        MAX(coalesce(null_frac,0)) AS maxfracsum,
>        23+(
>          SELECT 1+count(*)/8
>          FROM pg_stats s2
>          WHERE null_frac<>0 AND s2.schemaname = ns.nspname AND
> s2.tablename = tbl.relname
>        ) AS nullhdr
>      FROM pg_attribute att
>      JOIN pg_class_alive tbl ON att.attrelid = tbl.oid
>      JOIN pg_namespace ns ON ns.oid = tbl.relnamespace
>      LEFT JOIN pg_stats s ON s.schemaname=ns.nspname
>      AND s.tablename = tbl.relname
>      AND s.inherited=false
>      AND s.attname=att.attname
>      WHERE att.attnum > 0 AND tbl.relkind='r'
>      GROUP BY 1,2
>    ) AS foo
>  ) AS rs
>  ON cc.relname = rs.relname AND nn.nspname = rs.nspname
>  LEFT JOIN pg_index i ON indrelid = cc.oid
>  LEFT JOIN pg_class_alive c2 ON c2.oid = i.indexrelid
> ```
>
> it runs in 10s, 20x faster (https://explain.depesz.com/s/K4SH)
>
> The rabbit hole probably goes deeper (e.g. should do the same for
> pg_statistic and pg_attribute and create a new pg_stats view).
>
> I am not able (at least not quickly) change the amount of temporary tables
> created or make the analytics queries finish quicker. Apart from the above
> hack of filtering out live tuples to a separate table is there anything I
> could do?
>
> Thank you,
> Marcin Gozdalik
>
> --
> Marcin Gozdalik
>

Reply via email to