Thanks, everyone, for your help. I am not familiar with the internals of
vacuum freeze; I am just curious: if there are no row changes on the table (in
other words, all pages are frozen), why could an index page have dead tuples?
Is it possible to scan the data pages first and, if all data pages are frozen,
skip the index page scan step? Perhaps there is some other reason vacuum freeze
scans the index pages first; in that case, would it be possible to provide an
option to skip the index page scan step in the vacuum freeze command? Thanks.
From: Robert Haas <robertmh...@gmail.com>
Sent: December 1, 2016 13:50:49
To: Tom Lane
Cc: xu jian; Masahiko Sawada; email@example.com
Subject: Re: [HACKERS] Re: [HACKERS] Reply: [HACKERS] postgres 1 of 2 can pg 9.6
vacuum freeze skip page on index?
On Thu, Dec 1, 2016 at 1:39 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> I think that the indexes only need to be scanned if the VACUUM finds
>> dead tuples. But even 1 dead tuple will cause a complete scan of
>> every index. I've complained about this before and I think there's
>> room for improvement here, but nobody's been motivated enough to
>> pursue this yet.
> The thing that's been speculated about in the past is having some
> threshold larger than 1 on the minimum number of dead tuples needed
> to cause a cleanup pass.  It wouldn't be hard to implement, if you
> could get consensus on what the threshold should be.  I'd think
> some algorithm similar to the autovacuum thresholds might be
> appropriate. It's not quite clear how this would interact with
> HOT pruning, though.
What's the relevance of HOT pruning here?
I was thinking that the relevant metric might be how many pages
contain dead tuples, because what we really want to do to reduce the
cost of future vacuuming and future index-only scans is get pages
marked all-visible. Say, if less than 2% of the pages in the table
contain dead tuples and the space required to store the TIDs is less
than 50% of maintenance_work_mem, skip the index scans. The first of
those thresholds, at least, would probably need to be configurable,
but that kind of idea.
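
A minimal sketch of the heuristic described above, written in Python purely for illustration. The function name, its parameters, and the default values are hypothetical and are not part of PostgreSQL; the 6-byte TID size reflects PostgreSQL's ItemPointerData, and maintenance_work_mem is assumed to be given in kilobytes.

```python
TID_BYTES = 6  # size of one heap tuple identifier (ItemPointerData)


def should_skip_index_scans(total_pages, pages_with_dead_tuples, dead_tuples,
                            maintenance_work_mem_kb,
                            max_dead_page_fraction=0.02,
                            max_mem_fraction=0.50):
    """Return True if the index scans could be skipped for this vacuum.

    Hypothetical rule: skip when fewer than 2% of heap pages contain dead
    tuples AND the dead-tuple TIDs need less than 50% of
    maintenance_work_mem.
    """
    if dead_tuples == 0 or total_pages == 0:
        # No dead tuples were found, so there is nothing to remove
        # from the indexes.
        return True

    dead_page_fraction = pages_with_dead_tuples / total_pages
    tid_bytes_needed = dead_tuples * TID_BYTES
    mem_budget_bytes = maintenance_work_mem_kb * 1024 * max_mem_fraction

    return (dead_page_fraction < max_dead_page_fraction
            and tid_bytes_needed < mem_budget_bytes)
```

For example, with 100,000 heap pages, 1,000 of them containing a total of 5,000 dead tuples, and maintenance_work_mem at 64MB (65536 kB), both conditions hold and the sketch reports that the index scans could be skipped. Raising the number of affected pages to 5,000 (5%) makes it report that they should run.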
The alternative that's been proposed is to do something based on the
number of dead tuples but, as somebody pointed out in a previous
discussion of this topic, one dead tuple per page throughout the whole
table is a LOT worse than the same number of dead tuples all on the same
pages. You don't want to keep scanning large chunks of the heap
because you're too lazy to visit the indexes.
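
To illustrate why a page-based metric differs from a tuple-based one, here is a small Python sketch. The helper name and the sample data are made up for illustration; TIDs are modeled simply as (block number, offset) pairs.

```python
def pages_with_dead_tuples(dead_tids):
    """Count the distinct heap pages that hold at least one dead tuple."""
    return len({block for block, _offset in dead_tids})


# 1,000 dead tuples, one on each of 1,000 different pages.
scattered = [(block, 1) for block in range(1000)]

# 1,000 dead tuples packed onto just 10 pages (100 per page).
clustered = [(block, offset)
             for block in range(10)
             for offset in range(1, 101)]

print(len(scattered), pages_with_dead_tuples(scattered))
print(len(clustered), pages_with_dead_tuples(clustered))
```

Both lists hold the same number of dead tuples, but the scattered case leaves 1,000 pages that cannot be marked all-visible, while the clustered case leaves only 10.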