Kadri Raudsepp <kadri.rauds...@nordicgaming.com> writes:
> I have set up a cron-job that sends me daily reports on bloat amount in
> tables and indices, which I calculate using pg_stats, not pgstattuple, for
> performance and I/O reasons. If the bloat amount or percentage are big
> enough, I use pg_repack to get rid of it. At some point I noticed, that
> some tables keep showing up in the reports with the same amount of bloat,
> which pg_repack was seemingly unable to remove. Investigation showed that
> pgstattuple gave very different results than my bloat-finding query.
> Reason - for some tables there are some columns that never show up in
> pg_statistics.

Hmm.  Eyeballing the ANALYZE code, I note that it will decide that it
hasn't got any valid statistics for a column if (1) it finds no NULL
values and (2) every single sampled value in the column is too wide
(more than WIDTH_THRESHOLD = 1024 bytes wide).  Does this describe your
problematic column?

It seems like the code is being too conservative here --- it could at
least generate valid values for stanullfrac and stawidth.  I'm inclined
to think maybe it should also set stadistinct = -1 ("unique") in this
case, since the basic assumption that validates ignoring very wide
values is that they aren't duplicates.

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to