David Gould <da...@sonic.net> writes:
> On Sun, 4 Mar 2018 07:49:46 -0800
> Jeff Janes <jeff.ja...@gmail.com> wrote:
>> I don't see how it could have caused the problem in the first place.  In
>> your demonstration case, you had to turn off autovac in order to get it to
>> happen, and then when autovac is turned back on, it is all primed for an
>> autovac to launch, go through, touch almost all of the pages, and fix it
>> for you.  How did your original table get into a state where this wouldn't
>> happen?

> One more way for this to happen, vacuum was including the dead tuples in the
> estimate in addition to the live tuples.

FWIW, I've been continuing to think about this and poke at your example,
and I am having the same difficulty as Jeff.  While it's clear that if you
managed to get into a state with wildly inflated reltuples, ANALYZE would
fail to get out of it, it's not clear how you could get to such a state
without additional contributing factors.  This ANALYZE behavior only seems
to result in an incremental increase in reltuples per run, and so that
shouldn't prevent autovacuum from happening and fixing the estimate ---
maybe not as quickly as it should happen, but it'd happen.

The reasons I'm harping on this are (a) if there are additional bugs
contributing to the problem, we need to identify them and fix them;
(b) we need to understand what the triggering conditions are in some
detail, so that we can decide whether this bug is bad enough to justify
back-patching a behavioral change.  I remain concerned that the proposed
fix is too simplistic and will have some unforeseen side-effects, so
I'd really rather just put it in HEAD and let it go through a beta test
cycle before it gets loosed on the world.

Another issue I have after thinking more is that we need to consider
what should happen during a combined VACUUM+ANALYZE.  In this situation,
with the proposed patch, we'd overwrite VACUUM's result with an estimate
derived from ANALYZE's sample ... even though VACUUM's result might've
come from a full-table scan and hence be exact.  In the existing code
a small ANALYZE sample can't do a lot of damage to VACUUM's result, but
that would no longer be true with this.  I'm inclined to feel that we
should trust VACUUM's result for reltuples more than ANALYZE's, on the
grounds that if there actually was any large change in reltuples, VACUUM
would have looked at most of the pages and hence have a more reliable
number.  Hence I think we should skip the pg_class update for ANALYZE if
it's in a combined VACUUM+ANALYZE, at least unless ANALYZE looked at all
(most of?) the pages.  That could be mechanized with something like

-       if (!inh)
+       if (!inh && !(options & VACOPT_VACUUM))

controlling do_analyze_rel's call to vac_update_relstats, maybe with a
check on scanned_pages vs total_pages.  Perhaps the call to
pgstat_report_analyze needs to be filtered similarly (or maybe we still
want to report that an analyze happened, but somehow tell the stats
collector not to change its counts?)

Also, as a stylistic matter, I'd be inclined not to touch
vac_estimate_reltuples' behavior.  The place where the rubber is meeting
the road is

    *totalrows = vac_estimate_reltuples(onerel, true,
    if (bs.m > 0)
        *totaldeadrows = floor((deadrows / bs.m) * totalblocks + 0.5);
        *totaldeadrows = 0.0;

and it seems to me it'd make more sense to abandon the use of
vac_estimate_reltuples entirely, and just calculate totalrows in a fashion
exactly parallel to totaldeadrows.  (I think that's how the code used to
look here ...)

In HEAD, we could then drop vac_estimate_reltuples' is_analyze argument
altogether, though that would be unwise in the back branches (if we
back-patch) since we have no idea whether any third party code is calling
this function.

                        regards, tom lane

Reply via email to