On Wed, Jan 08, 2025 at 07:01:53PM -0500, Robert Treat wrote:
> To be frank, this patch feels like a solution in search of a problem,
> and as I read back through the thread, it isn't clear what problem
> this is intended to fix.

Thanks for sharing your thoughts.  FWIW I've heard various strategies
over the years for ensuring large tables are vacuumed more often,
including per-table settings for autovacuum_vacuum_threshold, etc.  At
its core, this patch is intended to ensure larger tables are vacuumed
often enough by default.  But I do think it's also important to
consider _why_ folks want to vacuum larger tables more often.  More on
that below...
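(For anyone following along, the per-table strategy I'm referring to
looks something like the sketch below.  The table name and values are
purely illustrative, not recommendations:)

    -- Hypothetical example: make autovacuum visit a large table sooner
    -- by shrinking the fraction of it that must be dead (default 0.2)
    -- and keeping the flat threshold modest (default 50).
    ALTER TABLE big_table SET (
        autovacuum_vacuum_scale_factor = 0.01,
        autovacuum_vacuum_threshold = 1000
    );

Having to hand-tune every large table like this is part of what the
patch is meant to avoid.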
> Is the patch supposed to help with wraparound prevention?
> autovac_freeze_max_age already covers that, and when it doesn't
> vacuum_failsafe_age helps out.

Not really, although I certainly don't think it hurts matters in that
department.  In any case, while autovacuum_freeze_max_age and
vacuum_failsafe_age are incredibly important backstops for wraparound
issues, it's probably not great to rely on them too much for bloat, etc.

> A couple of people mentioned issues around hitting the index wall when
> vacuuming large tables, but we believe that problem is mostly resolved
> due to radix based tid storage, so this doesn't solve that. (To the
> degree you don't think v17 has baked into enough production workloads
> to be sure, I'd agree, but that's also an argument against doing more
> work that might not be needed)

Agreed.

> Maybe the hope is that this setting will cause vacuum to run more
> often to help ameliorate i/o work from freeze vacuums kicking in, but
> I suspect that Melanie's nearby work on eager vacuuming is a smarter
> solution towards this problem (warning, it also may want to add more
> gucs), so I think we're not solving that, and in fact might be
> undercutting it.

I haven't paid enough attention to the eager freezing work to have an
opinion on this point.

> I guess that means this is supposed to help with bloat management? but
> only on large tables? I guess because you run vacuums more often?

Right.  I think Robert Haas explained it well [0] [1].

> Except that the adages of running vacuums more often don't apply as
> cleanly to large tables, because those tables typically come with
> large indexes, and while we have a lot of machinery in place to help
> with repeated scans of the heap, that same machinery doesn't exist for
> scanning the indexes, which gives you sort of an exponential curve
> around vacuum times as table size (but actually index size) grows
> larger. On the upside, this does mean we're less likely to see a 50x
> boost in vacuums on large tables that some seemed concerned about, but
> on the downside its because we're probably going to increase the
> probability of vacuum worker starvation.

IIUC your concern is that instead of incurring one gigantic vacuum
every once in a while, we are incurring multiple medium vacuums more
often, to the point that we are spending significantly more time
vacuuming a table than before.  Is that right?  If so, I'm curious what
you think about the discussion upthread on this point [2].
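To put rough numbers on that: autovacuum triggers a vacuum once dead
tuples exceed autovacuum_vacuum_threshold +
autovacuum_vacuum_scale_factor * reltuples, i.e., 50 + 0.2 * reltuples
with the defaults.  So a 1B-row table waits for ~200M dead tuples
between vacuums, and a 25B-row table for ~5B; with a (purely
illustrative) 100M-row cap, those tables would instead be vacuumed ~2x
and ~50x as often, respectively, with each run processing
proportionally fewer dead tuples.  Something like this query shows the
difference (again, the 100M figure is just an assumption for
illustration):

    -- Compare today's trigger point with a hypothetical 100M cap.
    -- reltuples is -1 until a table has been analyzed, hence the filter.
    SELECT relname,
           reltuples,
           50 + 0.2 * reltuples AS dead_tuples_today,
           LEAST(50 + 0.2 * reltuples, 100000000) AS dead_tuples_capped
      FROM pg_class
     WHERE relkind = 'r' AND reltuples > 0
     ORDER BY reltuples DESC
     LIMIT 10;

Note that a 100M cap only changes anything for tables with more than
~500M rows, since 0.2 * 500M = 100M.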
> But getting back to goals, if your goal is to help with bloat
> management, trying to tie that to a number that doesn't cleanly map to
> the meta information of the table in question is a poor way to do it.
> Meaning, to the degree that you are skeptical that vacuuming based on
> 20% of the rows of a table might not really be 20% of the size of the
> table, it's certainly going to be a closer map than 100million rows in
> a n number of tables of unknown (but presumably greater than
> 500million?) numbers of rows of unknown sizes. And again, we have a
> means to tackle these bloat cases already; lowering
> vacuum_scale_factor.

I disagree on this point.  I think the fact that folks are forced to
make per-table adjustments to parameters like vacuum_scale_factor and
are participating in vigorous discussions like this one indicates that
the existing system isn't sufficient (or at least isn't sufficient by
default).  That's not to say that adding a hard cap is perfect, either,
but I don't think we should let perfect be the enemy of good,
especially not at this stage of v18 development.

[0] https://postgr.es/m/CA%2BTgmoY4BENJyYcnU2eLFYZ73MZOb72WymNBH7vug6DtA%2BCZZw%40mail.gmail.com
[1] https://youtu.be/RfTD-Twpvac?&t=1979
[2] https://postgr.es/m/20240507211702.GA2720371%40nathanxps13

-- 
nathan