On Thu, Dec 1, 2016 at 1:39 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> I think that the indexes only need to be scanned if the VACUUM finds
>> dead tuples. But even 1 dead tuple will cause a complete scan of
>> every index. I've complained about this before and I think there's
>> room for improvement here, but nobody's been motivated enough to
>> pursue this yet.
> The thing that's been speculated about in the past is having some
> threshold larger than 1 on the minimum number of dead tuples needed
> to cause a cleanup pass. It wouldn't be hard to implement, if you
> could get consensus on what the threshold should be. I'd think
> some algorithm similar to the autovacuum thresholds might be
> appropriate. It's not quite clear how this would interact with
> HOT pruning, though.

What's the relevance of HOT pruning here?

I was thinking that the relevant metric might be how many pages
contain dead tuples, because what we really want to do to reduce the
cost of future vacuuming and future index-only scans is get pages
marked all-visible. Say, if fewer than 2% of the pages in the table
contain dead tuples and the space required to store the TIDs is less
than 50% of maintenance_work_mem, skip the index scans. The first of
those thresholds, at least, would probably need to be configurable,
but that's the kind of idea I have in mind.
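
To make that concrete, here is a rough sketch in C of the check I'm
describing. Every name in it is hypothetical and nothing here comes from
the existing vacuum code. It assumes a heap TID takes 6 bytes and that
maintenance_work_mem is expressed in kilobytes. The 2% figure is passed
in as a parameter, since that is the threshold that would likely need to
be configurable.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: one heap TID (block number + offset) occupies 6 bytes. */
#define SKETCH_TID_BYTES 6

/*
 * Hypothetical helper, not part of PostgreSQL: decide whether a vacuum
 * pass could skip the index scans.
 *
 * Skip when both conditions hold:
 *   1. the fraction of heap pages containing dead tuples is below
 *      max_dead_page_fraction (e.g. 0.02 for 2%), and
 *   2. the dead-tuple TIDs need less than half of maintenance_work_mem.
 */
static bool
sketch_skip_index_scans(uint64_t total_pages,
                        uint64_t pages_with_dead,
                        uint64_t dead_tuples,
                        uint64_t maintenance_work_mem_kb,
                        double max_dead_page_fraction)
{
    /* No dead tuples means there is nothing to remove from the indexes. */
    if (total_pages == 0 || dead_tuples == 0)
        return true;

    double   dead_page_fraction = (double) pages_with_dead / (double) total_pages;
    uint64_t tid_bytes = dead_tuples * SKETCH_TID_BYTES;
    uint64_t mem_budget = maintenance_work_mem_kb * 1024 / 2;  /* 50% */

    return dead_page_fraction < max_dead_page_fraction
        && tid_bytes < mem_budget;
}
```

With a 100,000-page table, 1,000 pages containing dead tuples, 5,000 dead
tuples, and maintenance_work_mem at 64MB, this returns true: 1% of pages
are affected and the TIDs need about 30kB. Raise the affected pages to
5,000 and it returns false.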

The alternative that's been proposed is to do something based on the
number of dead tuples but, as somebody pointed out in a previous
discussion of this topic, one dead tuple per page throughout the whole
table is a LOT worse than the same number of dead tuples all on the same
pages. You don't want to keep scanning large chunks of the heap
because you're too lazy to visit the indexes.
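
A bit of arithmetic shows why the distribution matters more than the raw
count. The helper below is purely illustrative (hypothetical name, and it
assumes dead tuples are packed evenly onto the affected pages): it
computes how many heap pages hold at least one dead tuple and therefore
can't be marked all-visible.

```c
#include <stdint.h>

/*
 * Hypothetical helper: number of heap pages holding at least one dead
 * tuple, assuming the dead tuples sit dead_per_page to a page.
 * Rounds up, since a partially filled page still counts.
 */
static uint64_t
sketch_pages_with_dead(uint64_t dead_tuples, uint64_t dead_per_page)
{
    if (dead_per_page == 0)
        return 0;
    return (dead_tuples + dead_per_page - 1) / dead_per_page;
}
```

For 10,000 dead tuples, one per page gives 10,000 affected pages, while
100 per page gives only 100. A tuple-count threshold treats those two
cases identically; a page-count threshold does not.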