Re: Why doesn't pgstat_report_analyze() focus on not-all-visible-page dead tuple counts, specifically?

Robert Haas Mon, 06 Dec 2021 12:08:18 -0800

On Sun, Dec 5, 2021 at 12:28 AM Peter Geoghegan <p...@bowt.ie> wrote:
> I wonder why we're counting the number of dead tuples (or LP_DEAD stub
> items) in the relation as a whole in ANALYZE's acquire_sample_rows()
> function. Wouldn't it make more sense to focus on the "live vs dead
> tuple properties" of heap pages that are not known to be all-visible
> when we generate statistics for our pgstat_report_analyze() report?
> These statistic collector stats are only for the benefit of autovacuum
> scheduling -- and so they're *consumed* in a way that is totally
> different to the nearby pg_statistic stats.


I think this could be the right idea. I'm not certain that it is, but
it does sound believable.

> This new approach also buys us the ability to extrapolate a new
> estimated number of dead tuples using old, stale stats. The stats can
> be combined with the authoritative/known number of not-all-visible
> pages right this second, since it's cheap enough to *accurately*
> determine the total number of not-all-visible pages for a heap
> relation by calling visibilitymap_count(). My guess is that this would
> be much more accurate in practice: provided the original average
> number of dead/live tuples (tuples per not-all-visible block) was
> still reasonably accurate, the extrapolated "total dead tuples right
> now" values would also be accurate.

So does this. If some of the table is now all-visible when it wasn't
before, it's certainly a good guess that the portions that still
aren't have about the same distribution of dead tuples that they did
before ... although the other direction is less clear: it seems
possible that newly not-all-visible pages have fewer dead tuples than
ones which have been not-all-visible for a while. But you have to make
some guess.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Re: Why doesn't pgstat_report_analyze() focus on not-all-visible-page dead tuple counts, specifically?

Reply via email to