Re: [HACKERS] GUC for cleanup indexes threshold.

Masahiko Sawada Mon, 25 Sep 2017 03:21:37 -0700

On Fri, Sep 22, 2017 at 5:31 PM, Kyotaro HORIGUCHI
<horiguchi.kyot...@lab.ntt.co.jp> wrote:
> At Fri, 22 Sep 2017 17:21:04 +0900, Masahiko Sawada <sawada.m...@gmail.com> 
> wrote in <cad21aobn9ucgmduinx2ptu8upetohnr-a35abcqyznlfvwd...@mail.gmail.com>
>> On Fri, Sep 22, 2017 at 4:16 PM, Kyotaro HORIGUCHI
>> <horiguchi.kyot...@lab.ntt.co.jp> wrote:
>> > At Fri, 22 Sep 2017 15:00:20 +0900, Masahiko Sawada 
>> > <sawada.m...@gmail.com> wrote in 
>> > <cad21aod6zgb1w6ps1axj0ccab_chdyiitntedpmhkefgg13...@mail.gmail.com>
>> >> On Tue, Sep 19, 2017 at 3:31 PM, Kyotaro HORIGUCHI
>> >> <horiguchi.kyot...@lab.ntt.co.jp> wrote:
>> >> Could you elaborate on this? For example, in a btree index the index
>> >> cleanup skips the index scan if index_bulk_delete has been called
>> >> during vacuuming, because stats != NULL. So I think we don't need
>> >> such a flag.
>> >
>> > The flag works so that two successive full index scans don't
>> > happen in one vacuum round. If any rows are fully deleted, the
>> > btvacuumcleanup that follows does nothing.
>> >
>> > I think what you wanted to solve here was the problem that
>> > index_vacuum_cleanup runs a full scan even if it ends with no
>> > actual work, in manual or anti-wraparound vacuums.  (I'm
>> > getting a bit confused on this..) It is caused by using the
>> > pointer "stats" as the flag that tells it to do that. If the
>> > stats-as-a-flag worked as expected, the GUC wouldn't seem to be
>> > required.
>>
>> Hmm, my proposal is that if a table hasn't changed much since the
>> previous vacuum, we skip cleaning up the index.
>>
>> If the table has at least one garbage tuple, we do lazy_vacuum_index
>> and an IndexBulkDeleteResult is stored, which causes btvacuumcleanup
>> to be skipped. On the other hand, if the table doesn't have any
>> garbage but some new tuples have been inserted since the previous
>> vacuum, we don't do lazy_vacuum_index but do lazy_cleanup_index. In
>> this case, we always do lazy_cleanup_index (i.e., we do the full scan)
>> even if only one tuple was inserted. That's why I proposed a new GUC
>> parameter which allows us to skip lazy_cleanup_index in that case.
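
To make the case analysis quoted above concrete, here is a minimal
sketch in C. It is not code from the patch: BulkDeleteStats stands in
for IndexBulkDeleteResult, and decide_cleanup, changed_fraction and
cleanup_threshold are hypothetical names. Comparing a "fraction of the
table changed" against the GUC value is also only an assumption about
how such a threshold could be applied.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for IndexBulkDeleteResult; only NULL-ness matters. */
typedef struct
{
    double      num_index_tuples;
} BulkDeleteStats;

typedef enum
{
    CLEANUP_NOOP,               /* bulk delete already scanned the index */
    CLEANUP_SKIPPED,            /* skipped because of the proposed GUC */
    CLEANUP_FULL_SCAN           /* cleanup does a full index scan */
} CleanupAction;

/*
 * stats != NULL means the bulk-delete pass ran in this vacuum round, so
 * cleanup has nothing left to scan.  Otherwise the (assumed) threshold
 * decides whether the full scan is worth doing.  A threshold of 0 keeps
 * the current behavior: always scan.
 */
static CleanupAction
decide_cleanup(const BulkDeleteStats *stats,
               double changed_fraction,
               double cleanup_threshold)
{
    assert(cleanup_threshold >= 0.0);

    if (stats != NULL)
        return CLEANUP_NOOP;
    if (changed_fraction < cleanup_threshold)
        return CLEANUP_SKIPPED;
    return CLEANUP_FULL_SCAN;
}
```

With cleanup_threshold = 0.1, a table where 1% of the tuples were
inserted and nothing was deleted would skip the scan, while the same
table with threshold 0 would still get the full scan.
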
>
> I think the problem raised in this thread is that the last index
> scan may leave dangling pages.
>
>> > In addition to that, as Simon and Peter pointed out,
>> > index_bulk_delete can leave not-fully-removed pages (so-called
>> > half-dead pages and pages that are recyclable but not registered
>> > in FSM, AFAICS) in some cases, mainly because of the
>> > RecentGlobalXmin interlock. In this case, just inhibiting the
>> > cleanup scan by a threshold lets such dangling pages persist in
>> > the index. (I couldn't produce that many dangling pages, though..)
>> >
>> > The first patch in the mail (*1) does that. It seems to have some
>> > bugs, though..
>> >
>> >
>> > Since the dangling pages persist until autovacuum decides to scan
>> > the table they belong to again, we should run a vacuum round (or
>> > index_vacuum_cleanup itself) even with no dead rows if we want
>> > to clean up such pages within a certain period. The second patch
>> > does that.
>> >
>>
>> IIUC half-dead pages are not relevant to this proposal. The proposal
>> has two problems:
>>
>> * By skipping index cleanup we could leave recyclable pages that are
>> not marked as recyclable.
>
> Yes.
>
>> * we stash an XID when a btree page is deleted, which is used to
>> determine when it's finally safe to recycle the page
>
> Is it a "problem" of this proposal?
>


As Peter explained before[1], the problem is that there is an XID
stored in dead btree pages that is used in the subsequent
RecentGlobalXmin interlock that determines if recycling is safe.
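
As a rough illustration of that interlock, here is a minimal sketch in
C. It is a simplified model, not the nbtree code: SketchPage,
xid_precedes and page_is_recyclable are hypothetical names, the XID
comparison ignores the special permanent XIDs, and the real check also
treats never-initialized pages as recyclable, which is left out here.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/*
 * Simplified circular XID comparison (assumption: both XIDs are normal
 * XIDs, so modulo-2^32 arithmetic is enough).
 */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/* Hypothetical model of the bits of a btree page that matter here. */
typedef struct
{
    bool            deleted;        /* page has been deleted from the tree */
    TransactionId   deletion_xid;   /* XID stashed when it was deleted */
} SketchPage;

/*
 * A deleted page may be recycled only once the stashed XID is older than
 * recent_global_xmin, i.e. no running transaction can still reach it.
 */
static bool
page_is_recyclable(const SketchPage *page, TransactionId recent_global_xmin)
{
    if (!page->deleted)
        return false;
    return xid_precedes(page->deletion_xid, recent_global_xmin);
}
```

The point for this proposal is that the answer changes over time: a
page deleted at XID 100 is not recyclable while the horizon is still
100, and only becomes so after the horizon advances, so some later pass
over the index has to come back and check it again.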

[1] 
https://www.postgresql.org/message-id/CAH2-Wz%3D1%3Dt5fcGGfarQGcAWBqaCh%2BdLMjpYCYHpEyzK8Qg6OrQ%40mail.gmail.com

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


