On Tue, Aug 26, 2014 at 12:19 PM, Alvaro Herrera
<alvhe...@2ndquadrant.com> wrote:
>> >On Mon, May 5, 2014 at 11:57 AM, Mark Kirkwood
>> ><mark.kirkw...@catalyst.net.nz> wrote:
>> >I could think of 2 ways to change this:
>> >
>> >a. if the user has specified a cost_limit value for a table, then just
>> >     use it rather than rebalancing based on the value of the system-wide
>> >     GUC autovacuum_vacuum_cost_limit
>> >b. alternatively, restrict the per-table value to be less than the
>> >     system-wide value?
>> >
>> >The former approach is used for autovacuum parameters like scale_factor
>> >and the latter for parameters like freeze_max_age.
>> >
> I've been giving some thought to this.  Really, there is no way to
> handle this sensibly while at the same time keeping the documented
> behavior -- or in other words, what we have documented is not useful
> behavior.  Your option (b) above is an easy solution to the problem,
> however it means that the user will have serious trouble configuring the
> system in scenarios such as volatile tables, as Mark says -- essentially
> that will foreclose the option of using autovacuum for them.
>
> I'm not sure I like your (a) proposal much better.  One problem there is
> that if you set the values for a table to be exactly the same values as
> in postgresql.conf, it will behave completely differently because it will
> not participate in balancing.  To me this seems to violate POLA.
>
> So my proposal is a bit more complicated.  First we introduce the notion
> of a single number, to enable sorting and computations: the "delay
> equivalent", ...

I favor option (a).  There's something to be said for your proposal
in terms of logical consistency with what we have now, but to be
honest I'm not sure it's the behavior anyone wants (I would welcome
more feedback on what people actually want).  I think we should view
an attempt to set a limit for a particular table as a way to control
the rate at which that table is vacuumed - period.

At least in situations that I've encountered, it's typical to be able
to determine the frequency with which a given table needs to be
vacuumed to avoid runaway bloat, and from that you can work backwards
to figure out how fast you must process it in MB/s, and from there you
can work backwards to figure out what cost delay will achieve that
effect.  But if the system tinkers with the cost delay under the hood,
then you're vacuuming at a different (slower) rate and, of course, the
table bloats.
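
To make that concrete, here's the sort of back-of-the-envelope math I
have in mind.  The table size and cycle time below are made-up inputs;
the constants are the stock cost-model defaults:

# Sketch: work backwards from "vacuum this table every N hours" to a
# cost limit.  Inputs are hypothetical; constants are PostgreSQL defaults.
PAGE_SIZE_KB = 8              # block size
COST_PAGE_MISS = 10           # vacuum_cost_page_miss
COST_DELAY_MS = 20            # autovacuum_vacuum_cost_delay

table_mb = 10000              # say, a 10GB table
cycle_s = 4 * 3600            # needs a full pass every 4 hours

mb_per_s = table_mb / cycle_s                   # ~0.69 MB/s required
pages_per_s = mb_per_s * 1024 / PAGE_SIZE_KB    # ~89 pages/s
cost_per_s = pages_per_s * COST_PAGE_MISS       # pessimistic: all misses
cost_limit = cost_per_s * COST_DELAY_MS / 1000  # credits per sleep, ~18
print(round(mb_per_s, 2), "MB/s ->", round(cost_limit), "cost_limit")

If the balancer then silently scales that limit down because other
workers happen to be running, the table misses its 4-hour cycle and
bloats, which is exactly the failure mode I'm describing.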

Now, in the case where you are setting an overall limit, there is at
least an argument to be made that you can determine the overall rate
of autovacuum-induced I/O activity that the system can tolerate, and
set your limits to stay within that budget, and then let the system
decide how to divide that I/O up between workers.  But if you're
overriding a per-table limit, I don't really see how that holds any
water.  The system I/O budget doesn't go up just because one
particular table is being vacuumed rather than any other.  The only
plausible use case for setting a per-table rate that I can see is when
you actually want the system to use that exact rate for that
particular table.
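
For the archives, here is a simplified sketch of what I understand the
current rebalancing to do - not the real autovacuum.c code, and the
worker numbers are invented.  (The limit/delay ratio it divides up is,
if I follow your proposal, the single number your "delay equivalent"
would capture.)

# Simplified model of proportional rebalancing: the system-wide "power"
# (cost_limit / cost_delay) is split among active workers in proportion
# to each worker's base limit, capped at that base limit.
def rebalance(system_limit, system_delay_ms, workers):
    """workers: list of (base_cost_limit, cost_delay_ms) tuples."""
    budget = system_limit / system_delay_ms
    total = sum(base / delay for base, delay in workers)
    return [max(1, min(int(budget * base / total), base))
            for base, delay in workers]

# Two default workers plus one table with a per-table limit of 1000:
# its effective limit comes out around 142, not the 1000 the user set.
print(rebalance(200, 20, [(200, 20), (200, 20), (1000, 20)]))

That scaling is precisely why a per-table setting stops meaning "vacuum
this table at this rate."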

I might be missing something, of course.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

