On Sat, Mar 4, 2017 at 5:59 AM, Peter Geoghegan <p...@bowt.ie> wrote:
> On Fri, Mar 3, 2017 at 2:41 PM, Peter Geoghegan <p...@bowt.ie> wrote:
>> In other words, the number of B-Tree pages that the last VACUUM
>> deleted, and thus made eligible to recycle by the next VACUUM has no
>> relationship with the number of pages the next VACUUM will itself end
>> up deleting, in general, or how long it will be before that next
>> VACUUM comes, if it comes at all, or anything else that seems at all
>> relevant.
> This raises another question, though: Why have this GUC at all? Why
> use *any* threshold that is to be compared against the number of heap
> pages that were processed by VACUUM this time?
> B-Tree page deletion isn't really part of the ordinary life cycle of a
> B-Tree index. In order for that to be the case, somebody would have to
> delete large ranges of indexed values (typically hundreds of logically
> contiguous values -- no gaps), without anyone else ever inserting new
> tuples that are in the same range before the next VACUUM. It's very
> unlikely that this would happen again and again in the real world. So,
> even if we never freeze, the number of B-Tree pages that we delete
> when we VACUUM today is generally a useless predictor of how many will
> be deleted by a VACUUM that occurs tomorrow. This is true despite the
> fact that the number of dead heap tuples is probably almost identical
> for each VACUUM (or the number of heap pages that end up being
> processed by VACUUM, if you prefer).
> Barring any concerns about crash safety, we can be completely certain
> that any "recycling-orientated B-Tree VACUUM" (a btvacuumcleanup()
> call to btvacuumscan(), which happens because there are no tuples in
> the index to kill) will end up recycling however many pages the last
> VACUUM managed to delete, which is a precisely knowable number (or
> could be made knowable if we stashed that number somewhere, like the
> meta-page). It will typically only take seconds or minutes after the
> VACUUM finishes for its RecentGlobalXmin interlock to stop being a
> problem (that is, for _bt_page_recyclable() to return "true" for any
> pages that that VACUUM deleted). From that point on, those deleted
> pages are "money in the bank" for the FSM. The only reason why we'd
> want to tie "the FSM withdrawing that money" to VACUUM is because that
> might be needed to clean up regular bloat anyway.
> The test performed by this patch within lazy_scan_heap(), to determine
> whether we should avoid calling lazy_cleanup_index() would therefore
> look like this, ideally: Do I want to go to the trouble of scanning
> this index (a cost that is proportionate to the size of the index) in
> order to recycle this number of known-deleted pages (a benefit that is
> proportionate to that number)? (I still think that the important thing
> is that we don't let the number of unclaimed-by-FSM recyclable pages
> grow forever, though.)
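To restate the cost/benefit test described above as a sketch (the function name and the threshold value are purely illustrative, not from any actual patch):

```c
#include <stdbool.h>

/*
 * Illustrative sketch of the proposed decision in lazy_scan_heap():
 * a cleanup scan costs O(index pages), while its benefit is
 * proportional to the number of known-deleted pages we could hand to
 * the FSM.  The 2% threshold here is just an example value.
 */
static bool
cleanup_scan_worthwhile(double index_pages, double known_deleted_pages)
{
    return known_deleted_pages > index_pages * 0.02;
}
```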

You are right that we don't want the number of unclaimed-by-FSM
recyclable pages to grow forever, but I think that won't happen with
this patch.  As soon as there are more deletions in the heap, the next
vacuum cycle will reclaim those pages via lazy_vacuum_index().
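A toy state model of that lifecycle (illustrative only, not PostgreSQL code): a page deleted by one VACUUM's index pass becomes recyclable afterwards, and the next index pass hands it to the FSM, so the backlog cannot grow without bound while VACUUMs keep running:

```c
#include <stdbool.h>

/* Toy model of a B-Tree page's life across VACUUM index passes.
 * Illustrative only -- the real transitions live in btvacuumscan()
 * and _bt_page_recyclable(). */
enum page_state { IN_USE, DELETED, IN_FSM };

/* One index pass: a previously deleted page is now recyclable (its
 * RecentGlobalXmin interlock has long since passed) and goes to the
 * FSM; a newly dead page gets deleted but cannot be recycled within
 * the same pass. */
static enum page_state
index_pass(enum page_state s, bool delete_now)
{
    if (s == DELETED)
        return IN_FSM;
    if (s == IN_USE && delete_now)
        return DELETED;
    return s;
}
```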

> (Thinks about it some more...)
> Unfortunately, I just saw a whole new problem with this patch:
> _bt_page_recyclable() is the one place in the B-Tree AM where we stash
> an XID.
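If the reference is to the XID that page deletion stashes in a deleted page's special space, which _bt_page_recyclable() later compares against RecentGlobalXmin, a simplified sketch of that test (field names simplified; the PageIsNew() case and XID wraparound are ignored here) would be:

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Simplified stand-in for the B-Tree page opaque area: the real code
 * keeps this in the page's special space and stores the deleting
 * transaction's XID in btpo.xact. */
typedef struct
{
    bool          is_deleted;  /* P_ISDELETED() in the real code */
    TransactionId btpo_xact;   /* XID stashed when the page was deleted */
} BTPageOpaqueSketch;

/* A deleted page is safe to recycle once no running snapshot could
 * still land on it, i.e. the stashed XID precedes RecentGlobalXmin
 * (TransactionIdPrecedes() in the real code; plain < here, ignoring
 * wraparound). */
static bool
page_recyclable_sketch(const BTPageOpaqueSketch *opaque,
                       TransactionId recent_global_xmin)
{
    return opaque->is_deleted && opaque->btpo_xact < recent_global_xmin;
}
```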

Can you be more specific about which code exactly you are referring to here?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)